Random walk methods are used to calculate the moments of negative image equilibrium distributions in synaptic weight dynamics governed by spike-timing dependent plasticity (STDP). The neural architecture of the model is based on the electrosensory lateral line lobe (ELL) of mormyrid electric fish, which forms a negative image of the reafferent signal from the fish's own electric discharge to optimize detection of sensory electric fields. Of particular behavioral importance to the fish is the variance of the equilibrium postsynaptic potential in the presence of noise, which is determined by the variance of the equilibrium weight distribution. Recurrence relations are derived for the moments of the equilibrium weight distribution, for arbitrary postsynaptic potential functions and arbitrary learning rules. For the case of homogeneous network parameters, explicit closed form solutions are developed for the covariances of the synaptic weight and postsynaptic potential distributions.
I. INTRODUCTION

Activity dependent synaptic plasticity is believed to be a fundamental mechanism for learning and adaptation in neural systems. Experimental observation of plasticity depending on mean spike rate led to rate-based models, in which the changes in synaptic weight depend on correlations in the mean spike rate of presynaptic and postsynaptic cells. Since mean spike rates are necessarily averages over time windows containing many spikes, the timing of individual spikes is ignored in rate-based models. More recent experimental work has shown that in some systems plasticity does depend on the precise timing of individual spikes. Models of such spike-timing dependent plasticity (STDP) assume the weight change due to each presynaptic and postsynaptic spike pair is given by some function of the time between them, called the spike-timing dependent learning rule. Changes due to all presynaptic and postsynaptic spike pairs are then summed to give the weight change due to presynaptic and postsynaptic spike trains.

One system in which STDP has been found experimentally is the electrosensory lateral line lobe (ELL), a cerebellum-like structure in mormyrid electric fish. The mormyrid detects objects in its environment by emitting a pulsed electrical discharge and observing the perturbations to the resulting electrical field at the skin surface due to sensory objects. To null out the predictable sensory input due solely to its own discharge, the mormyrid employs an efference copy of the motor command which initiates the discharge. An array of time-delayed, time-locked copies of the motor command innervates medium ganglion (MG) cells in ELL through plastic synapses. The MG cells also receive primary afferent input from electroreceptors on the skin, through nonplastic synapses. The plastic synapses whose input is time-locked to the motor command enable the formation and maintenance of a negative image [16] of the primary afferent signal, via a spike-timing dependent learning rule. The negative image effectively nulls out, in the MG cells, the sensory effect of the fish's own discharge, which simplifies the detection of perturbations due to sensory objects. Plasticity is critical to maintaining the negative image during ongoing changes in the precise form of the discharge due to large daily or seasonal fluctuations in water conductivity, or to changes in body size and shape during growth and development.

In order for the negative image to be maintainable in 
this way, the synaptic weight configuration giving rise to 
the negative image must be a stable equilibrium for the 
mean weight dynamics induced by the spike-timing dependent learning rule. Conditions for existence and stability of such negative image equilibria were first explored in [17], and later extended to arbitrary spike-timing dependent learning rules and arbitrary postsynaptic potential functions.

The equilibrium weight distribution in the presence of 
noise is also behaviorally important. Fluctuations in the 
weights due to noise lead to fluctuations in the negative
image. For example, we will show in this paper
that the variance of the equilibrium weight distribution 
is proportional to learning rate (i.e. to the magnitude of 
the weight changes induced by individual spikes or spike 
pairs). A slow learning rate leads to a small variance 
in equilibrium weight distribution and hence a more ac- 
curate negative image; a fast learning rate gives a large 
variance in equilibrium weight distribution and a less ac- 
curate negative image. Detectability of sensory objects 
is improved by a more accurate negative image; thus to 
optimize detectability the learning rate should be slow. 
However, if the fish's own discharge is changing (due to 
changes in water conductivity or body shape, for exam- 
ple) then the negative image must be updated to remain 
accurate. Such adaptability of the negative image favors 
a fast learning rate, to allow the negative image to keep 
up with changes in the discharge. The twin requirements 
of detectability and adaptability are thus in direct con- 
flict: any one choice of learning rate represents a com- 
promise between them. A natural hypothesis is that the 
learning rate in mormyrid ELL is the slowest learning 
rate sufficient to provide adaptability of the negative im- 
age on timescales over which the fish's discharge varies in 
the wild. A faster rate would not significantly improve 
adaptability and would degrade detectability; a slower 
rate would unacceptably degrade adaptability. 

In the present paper we seek to lay the groundwork 
for the analysis of such issues in a rigorous mathematical 
framework, by deriving analytic expressions for the moments
of the equilibrium weight distribution (when it exists)
for arbitrary spike-timing dependent learning rules
and arbitrary postsynaptic potential functions. We work 
with a model based on mormyrid ELL, but the technique 
is applicable to any network architecture. The approach 
used is to express the weight dynamics as a discrete time, 
inhomogeneous random walk. From the master equa- 
tion of this walk we derive a differential equation for the 
Fourier transform of the equilibrium weight distribution. 
Taylor expansion of this equation yields recurrence rela- 
tions for the moments. 

Random walks have been used extensively to model other physical systems (see the bibliography in [13]), and a large body of mathematical technique has been developed for their analysis [20]. But they have not previously been applied to STDP, where the standard approach has been to use the Fokker-Planck equation [15]. Given that the Fokker-Planck equation is at best an approximation [footnote 1] when applied to discrete stochastic processes [21], whereas random walk methods are exact, we believe it would be prudent to explore the utility of random walk methods for the analysis of STDP.

[Footnote 1] Moreover, the conditions under which the approximation is a good one, especially for the nonlinear Fokker-Planck equation, are far from clear [21]. Further discussion of this issue, in the context of STDP, will be the subject of a future paper.

The structure of the paper is as follows. In Section II we summarize the background facts about random walks, master equations, and characteristic functions that will be used in the present paper. In Section III we describe the architecture and dynamical assumptions of the model, and in Section IV we derive the random walk for the weight dynamics, for arbitrary system parameters. In Section V we illustrate the method for deriving recurrence relations for the moments of the equilibrium weight distribution by applying the method in the simplest possible setting, the case of a single synaptic weight. We then in Section VI apply the method to the full architecture, with arbitrary system parameters. In Section VII we specialize to the case of homogeneous system parameters, deriving more explicit analytical results for the covariance of the equilibrium weight and postsynaptic potential distributions. Finally in Section VIII we compute the weight and postsynaptic potential covariances for several examples of biological interest, and compare our predictions with Monte Carlo simulations.

II. RANDOM WALKS, MASTER EQUATIONS, 
AND CHARACTERISTIC FUNCTIONS 

The term random walk refers to any stochastic process 
in which the state variables change only at discrete times. 
The changes in state variables are called steps; from any 
given position there is a set of possible steps, each having 
a certain probability (or probability density). The set of 
possible steps may be discrete or continuous, and both 
the step values and step probabilities may depend on 
position. 

Random walks are natural models for systems having 
temporally discrete dynamics. They are natural models 
for STDP because weight changes in STDP are due to 
temporally discrete events (spikes or spike pairs). 

Suppose a state variable $w$ undergoes a random walk. Let the possible steps from position $w$ be $j_w(x)$, for $x$ in some index set $X$. Let the step $j_w(x)$ occur with probability density $p_w(x)$ in $x$. Let $P_n(w)$ be the probability distribution for $w$ after $n$ steps. We wish to derive the equation of motion for $P_n(w)$, usually referred to as the master equation.

If the state variable is $w'$ after $n$ steps and $w$ after $n+1$ steps, then $w = w' + j_{w'}(x)$ for some $x$. The probability for the state variable to be between $w$ and $w + dw$ after $n+1$ steps is therefore

\[ P_{n+1}(w)\,dw = \int dx\, p_{w'}(x)\,[P_n(w')\,dw']. \]

Hence the master equation is

\[ P_{n+1}(w) = \int dx\, p_{w'}(x)\,P_n(w')\,\frac{dw'}{dw}. \]

The quantity $dw'/dw$ compensates for any change in the density of states from time $n$ to time $n+1$, due to position dependence of the set of step values. From $w = w' + j_{w'}(x)$ we have

\[ \frac{dw'}{dw} = 1 - \frac{d}{dw}\, j_{w'}(x), \tag{1} \]

and hence the master equation is

\[ P_{n+1}(w) = \int dx\, p_{w'}(x)\,P_n(w')\left[1 - \frac{d}{dw}\, j_{w'}(x)\right]. \tag{2} \]

Suppose the set of step values is independent of position; then $dj_{w'}(x)/dw' = 0$, and the density of states factor in the master equation is identically 1. Denoting by $j(x)$ the common set of step values, we also have $w'$ explicitly in terms of $w$ and $x$: $w' = w - j(x)$. For such walks the master equation takes the simpler form

\[ P_{n+1}(w) = \int dx\, p_{w-j(x)}(x)\,P_n(w - j(x)). \tag{3} \]

All walks considered in the present paper will turn out to be of this type.

A probability distribution $P(w)$ is an equilibrium (stationary) distribution for the random walk if $P_n = P$ implies $P_{n+1} = P$; in other words, the dynamics of the walk leave $P$ unchanged. Hence $P(w)$ is an equilibrium distribution for the walk Eq. (3) if and only if it satisfies

\[ P(w) = \int dx\, p_{w-j(x)}(x)\,P(w - j(x)). \tag{4} \]
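
As a concrete illustration of Eqs. (3) and (4), the following minimal sketch iterates the master equation for a toy lattice walk. The walk itself (a lazy urn-type walk on the sites 0..M with steps -1, 0, +1) is an assumption chosen only because its equilibrium distribution is known to be binomial; it is not the STDP walk of this paper, and the names `step_prob` and `master_step` are hypothetical.

```python
import numpy as np
from math import comb

M = 20                           # lattice sites 0..M (toy example, not from the paper)
sites = np.arange(M + 1)
steps = (-1, 0, 1)               # position-independent set of step values j(x)

def step_prob(pos, j):
    """Position-dependent probability p_pos(j) of taking step j from site pos."""
    if j == 1:
        return 0.5 * (1.0 - pos / M)
    if j == -1:
        return 0.5 * (pos / M)
    return 0.5                   # "stay" step, keeps the chain aperiodic

def master_step(P):
    """One application of Eq. (3): P_{n+1}(w) = sum_j p_{w-j}(j) P_n(w-j)."""
    P_next = np.zeros_like(P)
    for target in sites:
        for j in steps:
            source = target - j
            if 0 <= source <= M:
                P_next[target] += step_prob(source, j) * P[source]
    return P_next

# Start from a delta distribution and iterate towards equilibrium.
P = np.zeros(M + 1)
P[0] = 1.0
for _ in range(2000):
    P = master_step(P)

# Known stationary distribution of this toy walk: Binomial(M, 1/2).
binomial = np.array([comb(M, int(n)) for n in sites]) / 2.0**M
stationary_residual = np.max(np.abs(master_step(binomial) - binomial))
```

Here `stationary_residual` measures how well the binomial distribution satisfies the discrete analogue of the equilibrium condition Eq. (4), and `P` can be compared with `binomial` to see the convergence of the iteration.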



To calculate the moments of a probability distribution $P(w)$, we will find it useful to invoke a property of its Fourier transform (often referred to as the characteristic function)

\[ \hat{P}(k) = \int dw\, e^{ikw} P(w). \tag{5} \]

Taking the $n$th derivative with respect to $k$ in Eq. (5) and evaluating at $k = 0$ yields

\[ \left.\frac{d^n \hat{P}(k)}{dk^n}\right|_{k=0} = \left.\int dw\, (iw)^n e^{ikw} P(w)\right|_{k=0} = i^n \int dw\, w^n P(w) = i^n \langle w^n \rangle. \tag{6} \]

Hence the moments of $P(w)$ are, up to powers of $i$, just the derivatives of the characteristic function $\hat{P}(k)$ evaluated at $k = 0$.
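
The relation between moments and derivatives of the characteristic function can be checked numerically. The sketch below is an illustration under stated assumptions: the three-point distribution, the finite-difference step `h`, and the helper name `char_fn` are all hypothetical choices, with the integral in Eq. (5) replaced by a sum over a discrete distribution.

```python
import numpy as np

def char_fn(k, values, probs):
    """Characteristic function of a discrete distribution (Eq. (5) with a sum)."""
    return np.sum(probs * np.exp(1j * k * values))

# Hypothetical three-point distribution used only for illustration.
values = np.array([-1.0, 0.5, 2.0])
probs = np.array([0.2, 0.5, 0.3])

# Central finite differences for the first and second derivatives at k = 0.
h = 1e-3
d1 = (char_fn(h, values, probs) - char_fn(-h, values, probs)) / (2 * h)
d2 = (char_fn(h, values, probs) - 2 * char_fn(0.0, values, probs)
      + char_fn(-h, values, probs)) / h**2

# Eq. (6): the n-th derivative at k = 0 equals i^n <w^n>.
mean_from_cf = (d1 / 1j).real
second_from_cf = (d2 / 1j**2).real

mean_direct = np.sum(probs * values)
second_direct = np.sum(probs * values**2)
```

The finite-difference estimates agree with the directly computed moments up to the discretization error of the difference quotients.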

For further background on random walks, see [20].

III. FRAMEWORK 

The model consists of a single postsynaptic cell (representing an MG cell) driven by a repeated sensory input (primary sensory reafference), an array of presynaptic cells whose spikes are time-locked to the repeated sensory input (the efference copy of the motor command), and noise (representing other unspecified inputs) (Fig. 1). This basic architecture is derived from mormyrid ELL, but is sufficiently general to capture the dynamics of other neural systems hypothesized to have an array of time-delayed, time-locked inputs through plastic synapses.

FIG. 1: Schematic of the architecture. The postsynaptic cell receives inputs from $N$ presynaptic neurons, a repeated sensory input $\phi(x)$, and a noisy input. Presynaptic cell $i$ spikes at time $x_i$ in each period of $\phi$, and has synaptic weight $w_i$ onto the postsynaptic cell.

The framework for the neural dynamics is the spike response (SR) model, without refractoriness. In SR models the effect of a presynaptic spike on a postsynaptic cell is to add to the postsynaptic membrane potential a contribution given by the product of the synaptic weight and a postsynaptic potential function (PSP), which is a function of time after the spike. Spike response models include leaky integrate-and-fire (LIF) models as a special case [23], and are used here because they simplify the derivation of analytic results.

The repeated sensory input is the postsynaptic poten- 
tial (PSP) in the postsynaptic cell due to primary sensory 
afferents, over a single EOD sweep. Each time-locked 
presynaptic cell i spikes (exactly once) at a fixed time 
within each sweep of the repeated sensory input, causing 
a corresponding PSP in the postsynaptic cell.

The total membrane potential in the postsynaptic cell is the sum of the repeated sensory input, the noisy input, and the PSPs due to time-locked presynaptic spikes, weighted by synaptic efficacies (weights) $w_i$. This membrane potential causes the postsynaptic cell to generate broad dendritic spikes [footnote 2] at a certain (noisy) rate. We assume that each presynaptic spike causes a constant change in the weight $w_i$ (nonassociative learning), and each postsynaptic and presynaptic spike pair causes a change in $w_i$ according to a spike-timing dependent learning rule, i.e. a function of the time difference between the postsynaptic and presynaptic spikes (associative learning).

[Footnote 2] The postsynaptic cell also generates simple spikes, but these are

The repeated sensory input has the form of a stereotyped pulse with variable interpulse interval. It has been found that the time-locked inputs occur for approximately the duration of the pulse, and are absent during interpulse intervals. The events which affect plasticity are thus restricted to the duration of the pulses, provided the width of the learning rule is much less than the width of a pulse (a requirement we will impose below). When calculating the weight changes due to plasticity we may therefore omit the variable interpulse intervals, and replace the repeated sensory input by a periodic input obtained by concatenation of the pulses.

Let the resulting period (pulse width) be $T$, and introduce two time variables: $x \in [0, T)$ for the time within each period of the sensory input, and $t = nT$, $n \in \mathbb{Z}$, for the time of initiation of each such period. General dynamical quantities will be functions of the pair $(x, t)$. The time-locked presynaptic cell $i$ spikes at a fixed time in each period. Denote this time by $x_i$. Let $w_i(x, t)$ be the synaptic weight of presynaptic cell $i$, and let $\mathcal{E}_i(s)$ be the PSP evoked by a spike in cell $i$ at time $s$ after the spike. We will assume $\mathcal{E}_i$ is causal: $\mathcal{E}_i(s) = 0$ for $s < 0$. Let $\alpha_i$ be the nonassociative weight change due to a presynaptic spike by cell $i$, and $\mathcal{L}_i(s)$ the associative weight change due to a postsynaptic spike at time $s$ after a presynaptic spike by cell $i$. Let $\phi(x)$ be the periodic sensory input, and $U(x,t)$ the total postsynaptic potential due to the non-noisy inputs.

We will assume that in each period of $\phi$, either zero or one postsynaptic spike occurs. The probability density (in $x$, for a given $t$) for a postsynaptic spike to occur at $(x, t)$ is assumed to be $\frac{1}{T} f(U(x,t))$, for some positive and strictly increasing function $f : \mathbb{R} \to [0, 1]$. The probability of zero postsynaptic spikes in the period beginning at $t$ is then $1 - \frac{1}{T}\int_0^T dx\, f(U(x,t))$. Heuristically, the function $f$ is the effective gain function of the postsynaptic cell in the presence of the noisy inputs, with the maximum slope of $f$ indicating the noise level: high or low noise correspond to an $f$ with small or large maximum slope respectively.

We assume that the period of $\phi$ is sufficiently long that refractoriness can be ignored. In each period there is exactly one spike by each presynaptic cell and at most one spike by the postsynaptic cell, so if the period of $\phi$ is longer than the refractory period of all cells involved then refractoriness is irrelevant and can be omitted from the model.

[Footnote 2, continued] not relevant for plasticity and no use is made of them in the present model. In this paper the phrase "postsynaptic spike" refers solely to broad, dendritic spikes.

We will implement changes in weights as discrete steps with no internal time course. We update weights synchronously, once per sweep of the periodic sensory input, at time $x = 0$ for each $t = nT$, $n \in \mathbb{Z}$. The value of $w_i$ in the period beginning at $(0, t)$ is then independent of $x$, and will be denoted $w_i(t)$. For synchronous updating to be a good approximation, weight changes per cycle must be small relative to the weights themselves: this is the slow learning rate assumption. Changes in weights due to different spikes and spike pairs will be summed linearly.

In biological systems, synaptic weights have bounded 
magnitude and never change sign (Dale's Law). We im- 
pose no such boundary conditions in the present model, 
but the results still apply to the biological case provided 
the weight equilibria and equilibrium variances are such 
that weights are almost always in the region enclosed by 
biological bounds. 

To simplify the derivation of the weight dynamics, we will assume that $\mathcal{E}_i(s)$, $\mathcal{L}_i(s)$ are zero or negligible for $|s| > t_{\mathcal{E}}, t_{\mathcal{L}}$ respectively, with $t_{\mathcal{E}}, t_{\mathcal{L}} \ll T$. We will also impose the slow learning rate assumption: $T \ll t_w$, where $t_w$ is the time-scale over which weights undergo significant relative change. The existence of approximate negative image states requires that the spacing $\delta$ of presynaptic spike times be much smaller than the widths of $\mathcal{E}_i$ and $\mathcal{L}_i$: $\delta \ll t_{\mathcal{E}}, t_{\mathcal{L}}$. These time-scale assumptions can be summarized as

\[ \delta \ll (t_{\mathcal{E}}, t_{\mathcal{L}}) \ll T \ll t_w. \]

Typical values from mormyrid ELL are: $\delta < 1$ ms [30], [C.C. Bell, private communication], $t_{\mathcal{E}} \sim 20$ ms, $t_{\mathcal{L}} \sim 40$ ms, $T \sim 80$ ms [C.C. Bell, private communication].
IV. WEIGHT DYNAMICS 

We now derive the random walk for the weight dynamics, by computing the possible weight changes $\Delta w_i(t) = w_i(t + T) - w_i(t)$ and their corresponding probabilities.

The nonassociative change in $w_i(t)$ due to the single presynaptic spike at $(x_i, t)$ is $\alpha_i$. For the associative change due to presynaptic and postsynaptic spike pairs, consider the effect of a single postsynaptic spike at $(x, t)$. The pair consisting of this spike and the presynaptic spike at $(x_i, t)$ causes a change $\mathcal{L}_i(x - x_i)$ in $w_i$. To account for pairs which straddle a period boundary, we also include the pairing with presynaptic spikes at $(x_i, t - T)$ and $(x_i, t + T)$, for a total change of

\[ \mathcal{L}_i(x - x_i - T) + \mathcal{L}_i(x - x_i) + \mathcal{L}_i(x - x_i + T). \tag{7} \]

For our intended biological application, where $t_{\mathcal{L}} \ll T$, at most one of the above terms is non-negligible, but all must be included to properly handle cases where $x - x_i$ is within $t_{\mathcal{L}}$ of $T$ or $-T$ (Fig. 2). Finally, $t_{\mathcal{L}} \ll T$ allows us to approximate Eq. (7) by

\[ \sum_{n=-\infty}^{\infty} \mathcal{L}_i(x - x_i - nT) = \mathring{\mathcal{L}}_i(x - x_i), \tag{8} \]

where $\mathring{\mathcal{L}}_i(s) = \sum_{n=-\infty}^{\infty} \mathcal{L}_i(s - nT)$ is the periodization of $\mathcal{L}_i$ with period $T$.

FIG. 2: Changes in weight due to pairing of presynaptic and postsynaptic spikes. (a) Pairing of a postsynaptic spike at time $(x, t)$ and presynaptic spike by cell $i$ at time $(x_i, t)$ causes a change $\mathcal{L}_i(x - x_i)$ in weight $w_i$. (b) For $x$ within $t_{\mathcal{L}}$ of a period boundary, we must include pairing with presynaptic spikes in the neighboring period. Pairing of a postsynaptic spike at time $(x, t)$ and presynaptic spike by cell $i$ at time $(x_i, t + T)$ causes a change $\mathcal{L}_i(x - x_i - T)$ in weight $w_i$.

FIG. 3: Postsynaptic potential due to presynaptic spikes. (a) Potential at time $(x, t)$ due to presynaptic spike by cell $i$ at time $(x_i, t)$ is $w_i(t)\mathcal{E}_i(x - x_i)$. (b) For $x$ within $t_{\mathcal{E}}$ of 0, we must include the potential due to presynaptic spikes in the preceding period. The potential at time $(x, t)$ due to the presynaptic spike by cell $i$ at time $(x_i, t - T)$ is $w_i(t - T)\mathcal{E}_i(x - x_i + T)$.
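
The periodization used in Eq. (8) is straightforward to compute for any kernel that decays quickly compared with the period. The following is a minimal sketch, assuming a hypothetical Gaussian-shaped learning rule of width 10 ms and a period of 80 ms; the function names `periodize` and `learning_rule` and the truncation parameter `n_terms` are illustrative choices, not part of the model specification.

```python
import numpy as np

def periodize(kernel, T, n_terms=5):
    """Return the truncated periodization s -> sum_{n=-n_terms}^{n_terms} kernel(s - n*T).

    For kernels that are negligible beyond a width much smaller than T,
    a handful of terms already reproduces the infinite sum to machine precision.
    """
    def periodic(s):
        s = np.asarray(s, dtype=float)
        return sum(kernel(s - n * T) for n in range(-n_terms, n_terms + 1))
    return periodic

def learning_rule(s, t_L=0.010):
    """Hypothetical learning rule (depression centred on zero delay), width t_L seconds."""
    return -np.exp(-(s / t_L) ** 2)

T = 0.080                                  # period in seconds (assumed value)
L_ring = periodize(learning_rule, T)

# Near a period boundary the periodized rule picks up the wrapped-around contribution:
wrapped_value = L_ring(T - 0.002)          # dominated by learning_rule(-0.002)
```

Evaluating `L_ring` at `s` and `s + T` gives the same value, which is the common periodicity exploited in the rest of the derivation.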

Quantity (8) is the change in weight $w_i(t)$ due to a single postsynaptic spike at $(x, t)$. A postsynaptic spike between $t$ and $t + T$ occurs with a probability density $\frac{1}{T} f(U(x,t))$ in $x$, with the probability of zero postsynaptic spikes being $1 - \frac{1}{T}\int_0^T dx\, f(U(x,t))$. Hence the change in $w_i$ due to postsynaptic spikes between $t$ and $t + T$ is $\mathring{\mathcal{L}}_i(x - x_i)$ with density $\frac{1}{T} f(U(x,t))$ in $x$, and $0$ with probability $1 - \frac{1}{T}\int_0^T dx\, f(U(x,t))$. The total change in $w_i(t)$ due to both nonassociative and associative learning is therefore

\[ \Delta w_i(t) = \begin{cases} \alpha_i + \mathring{\mathcal{L}}_i(x - x_i), & \text{density } \frac{1}{T} f(U(x,t)) \\ \alpha_i, & \text{probability } 1 - \frac{1}{T}\int_0^T dx\, f(U(x,t)). \end{cases} \tag{9} \]

We now compute the non-noisy component of the postsynaptic potential, $U(x,t)$. The contribution to $U(x,t)$ from the presynaptic spike by cell $i$ at time $(x_i, t + nT)$ is $w_i(t + nT)\mathcal{E}_i(x - x_i - nT)$. For $t_{\mathcal{E}} \ll T$ this quantity is non-negligible for at most one value of $n$, either the current period ($n = 0$) or the previous period ($n = -1$). But to handle edge effects (Fig. 3) we must include both, for a total contribution of

\[ w_i(t - T)\mathcal{E}_i(x - x_i + T) + w_i(t)\mathcal{E}_i(x - x_i). \tag{10} \]

The slow learning rate assumption allows us to approximate quantity (10) by

\[ w_i(t)\left[\mathcal{E}_i(x - x_i + T) + \mathcal{E}_i(x - x_i)\right]. \tag{11} \]

Finally, $t_{\mathcal{E}} \ll T$ allows us to approximate quantity (11) by

\[ w_i(t) \sum_{n=-\infty}^{\infty} \mathcal{E}_i(x - x_i - nT) = w_i(t)\,\mathring{\mathcal{E}}_i(x - x_i), \tag{12} \]

where $\mathring{\mathcal{E}}_i(s) = \sum_{n=-\infty}^{\infty} \mathcal{E}_i(s - nT)$ is the periodization of $\mathcal{E}_i$ with period $T$.

Quantity (12) is the contribution to $U(x, t)$ from cell $i$. The total postsynaptic potential is the summed contribution from all presynaptic cells, plus the repeated sensory input:

\[ U(x,t) = \phi(x) + \sum_{j=1}^{N} w_j(t)\,\mathring{\mathcal{E}}_j(x - x_j). \tag{13} \]

Define $\tilde{f}$ by

\[ \tilde{f}(x, w_1(t), \ldots, w_N(t)) = f\Big(\phi(x) + \sum_{j=1}^{N} w_j(t)\,\mathring{\mathcal{E}}_j(x - x_j)\Big). \tag{14} \]
Then from Eq. (9) and Eq. (13) we have

\[ \Delta w_i(t) = \begin{cases} \alpha_i + \mathring{\mathcal{L}}_i(x - x_i), & \text{density } \frac{1}{T}\tilde{f}(x, w_1(t), \ldots, w_N(t)) \\ \alpha_i, & \text{probability } 1 - \frac{1}{T}\int_0^T dx\, \tilde{f}(x, w_1(t), \ldots, w_N(t)). \end{cases} \tag{15} \]

Eq. (15) defines the random walk for the weight dynamics. It is discrete time (steps occur only at $t = nT$, $n \in \mathbb{Z}$), continuous space (steps can take a continuum of values), and inhomogeneous (step probabilities depend on position).

The common periodicity of the functions $\mathring{\mathcal{E}}_i$, $\mathring{\mathcal{L}}_i$ and $\phi$ is an important feature, allowing the systematic use of Fourier techniques.

V. ONE WEIGHT

To illustrate the technique in the simplest possible setting, we first examine the case of a single weight. If there is only one weight, $w_1(t)$, then without loss of generality we may take $x_1 = 0$, by translating $\phi$ if necessary. Writing $w(t)$, $\alpha$, $\mathring{\mathcal{L}}$ and $\mathring{\mathcal{E}}$ for $w_1(t)$, $\alpha_1$, $\mathring{\mathcal{L}}_1$ and $\mathring{\mathcal{E}}_1$, the random walk Eq. (15) for the weight dynamics becomes

\[ \Delta w(t) = \begin{cases} \alpha + \mathring{\mathcal{L}}(x), & \text{density } \frac{1}{T}\tilde{f}(x, w(t)) \\ \alpha, & \text{probability } 1 - \frac{1}{T}\int_0^T dx\, \tilde{f}(x, w(t)), \end{cases} \tag{16} \]

where

\[ \tilde{f}(x, w(t)) = f\big(\phi(x) + w(t)\,\mathring{\mathcal{E}}(x)\big). \]

From the random walk for the weight dynamics we derive the moments of the equilibrium weight distribution in three steps. First we write the master equation for the time evolution of the probability distribution of the weight, and the corresponding functional equation for the equilibrium (stationary) distribution. Taking the Fourier transform yields a differential equation for the Fourier transform of the equilibrium distribution. Taylor expansion of this equation yields recurrence relations for the moments.

Notice that the set of step values in the walk (16) is independent of $w$; hence the equilibrium distribution $P(w)$ must satisfy Eq. (4). From the step values and step probabilities in Eq. (16) we have

\[ P(w) = \left[1 - \frac{1}{T}\int_0^T dx\, \tilde{f}(x, w - \alpha)\right] P(w - \alpha) + \frac{1}{T}\int_0^T dx\, \tilde{f}\big(x, w - (\alpha + \mathring{\mathcal{L}}(x))\big)\, P\big(w - (\alpha + \mathring{\mathcal{L}}(x))\big). \tag{17} \]

Taking the Fourier transform $\int dw\, e^{ikw}$ on both sides, changing variables and rearranging yields

\[ \hat{P}(k)\left[1 - e^{ik\alpha}\right] = \frac{1}{T}\int_0^T dx \left[e^{ik(\alpha + \mathring{\mathcal{L}}(x))} - e^{ik\alpha}\right] \times \int dw'\, e^{ikw'}\, \tilde{f}(x, w')\, P(w'). \tag{18} \]

A physiologically plausible spike output function $f$ would take the form of a smooth, monotonically increasing sigmoid, but for maximal simplicity we assume $f$ is piecewise linear:

\[ f(u) = \begin{cases} 0, & u < \theta - \nu \\ \frac{1}{2}\left(1 + \frac{u - \theta}{\nu}\right), & \theta - \nu \le u \le \theta + \nu \\ 1, & u > \theta + \nu, \end{cases} \tag{19} \]

so that $\tilde{f}$ is given by

\[ \tilde{f}(x, w) = \begin{cases} 0, & U(x) < -\nu \\ \frac{1}{2}\left(1 + \frac{U(x)}{\nu}\right), & -\nu \le U(x) \le \nu \\ 1, & U(x) > \nu, \end{cases} \tag{20} \]

with $U(x) \equiv \phi(x) - \theta + w\,\mathring{\mathcal{E}}(x)$.

We further assume that the equilibrium weight distribution $P(w)$ is zero or negligible for $w$ such that $U(x) < -\nu$ or $U(x) > \nu$. This is a confinement condition on the equilibrium postsynaptic potential $U(x)$, and will be justified later. Note that the confinement condition helps justify the piecewise linear assumption on $f$, since the more "confined" the postsynaptic potential $U(x)$, the better our piecewise linear $f$ approximates a smooth sigmoid in the region where $U(x)$ is concentrated. If the confinement condition holds, then in Eq. (18) we may replace $\tilde{f}(x, w')$ under the integral by the following linear function of $w'$:

\[ \frac{1}{2}\left(1 + \frac{\phi(x) - \theta + w'\,\mathring{\mathcal{E}}(x)}{\nu}\right). \]



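Using $\int dw\, e^{ikw}\, w\, P(w) = \frac{1}{i}\hat{P}'(k)$, we then obtain

\[ \hat{P}(k)\left[1 - e^{ik\alpha} - \frac{1}{T}\int_0^T dx\, \frac{1}{2}\left(1 + \frac{\phi(x) - \theta}{\nu}\right)\eta(x)\right] = \frac{1}{i}\hat{P}'(k)\, \frac{1}{T}\int_0^T dx\, \frac{1}{2}\frac{\mathring{\mathcal{E}}(x)}{\nu}\,\eta(x), \tag{21} \]

where $\eta(x) = e^{ik(\alpha + \mathring{\mathcal{L}}(x))} - e^{ik\alpha}$. By Eq. (6), the moments of $P(w)$ are (up to powers of $i$) just the derivatives of $\hat{P}(k)$ at $k = 0$; since those derivatives are implicitly constrained by Eq. (21), the moments of $P(w)$ are constrained by Eq. (21). Specifically, the Taylor expansion of Eq. (21) around $k = 0$ will yield a hierarchy of recurrence relations for the derivatives of $\hat{P}(k)$, and hence for the moments of $P(w)$. The Taylor expansions of the exponentials are

\[ e^{ik\alpha} = \sum_{n=0}^{\infty} \frac{i^n}{n!}\, k^n \alpha^n, \qquad \eta(x) = \sum_{n=0}^{\infty} \frac{i^n}{n!}\, k^n \left[(\alpha + \mathring{\mathcal{L}}(x))^n - \alpha^n\right]. \]

For the expansion of the characteristic function $\hat{P}(k)$ we expand the exponential in the definition of $\hat{P}(k)$ and invert the order of summation and integration:

\[ \hat{P}(k) = \int dw\, e^{ikw} P(w) = \sum_{m=0}^{\infty} \frac{i^m}{m!}\, k^m \int dw\, w^m P(w) = \sum_{m=0}^{\infty} \frac{i^m}{m!}\, k^m \langle w^m \rangle. \]

From this it follows that

\[ \frac{1}{i}\hat{P}'(k) = \frac{1}{i}\sum_{m=1}^{\infty} \frac{i^m}{(m-1)!}\, k^{m-1} \langle w^m \rangle = \sum_{m=0}^{\infty} \frac{i^m}{m!}\, k^m \langle w^{m+1} \rangle. \]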



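By substituting these expansions into Eq. (21) and equating coefficients of $k^\mu$ on both sides, we obtain the following relations:

\[ \sum_{m=0}^{\mu} \binom{\mu}{m}\left[\gamma^{\phi}_{\mu-m}\langle w^m \rangle - \gamma^{\mathcal{E}}_{\mu-m}\langle w^{m+1} \rangle\right] = 0, \qquad \mu = 0, 1, 2, \ldots \tag{22} \]

where for brevity we have defined

\[ \gamma^{\phi}_n = \delta_{n0} - \alpha^n - \frac{1}{T}\int_0^T dx\, \frac{1}{2}\left(1 + \frac{\phi(x) - \theta}{\nu}\right)\left((\alpha + \mathring{\mathcal{L}}(x))^n - \alpha^n\right), \]

\[ \gamma^{\mathcal{E}}_n = \frac{1}{T}\int_0^T dx\, \frac{1}{2}\frac{\mathring{\mathcal{E}}(x)}{\nu}\left((\alpha + \mathring{\mathcal{L}}(x))^n - \alpha^n\right). \]

The relations (22) are lower triangular [footnote 3], and hence are easily rearranged to yield explicit recurrence relations for the moments in terms of moments of lower degree only:

\[ \langle w^\mu \rangle = \frac{1}{\mu\,\gamma^{\mathcal{E}}_1}\sum_{m=0}^{\mu-1} \Gamma^{\mu}_m \langle w^m \rangle, \qquad \mu = 1, 2, \ldots \tag{23} \]

where

\[ \Gamma^{\mu}_m = \binom{\mu}{m}\gamma^{\phi}_{\mu-m} - \binom{\mu}{m-1}\gamma^{\mathcal{E}}_{\mu-m+1}, \]

with the convention $\binom{\mu}{-1} = 0$.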

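We may now compute the central moments $M_k = \langle (w - \langle w \rangle)^k \rangle$, by expressing $\langle w^\mu \rangle$ in terms of the $\{M_k\}$:

\[ \langle w^\mu \rangle = \big\langle \big((w - \langle w \rangle) + \langle w \rangle\big)^\mu \big\rangle = \sum_{k=0}^{\mu} \binom{\mu}{k} M_k \langle w \rangle^{\mu-k}. \tag{24} \]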



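Substituting into Eq. (23) and rearranging yields

\[ \gamma^{\mathcal{E}}_1 M_\mu = \frac{1}{\mu}\sum_{k=2}^{\mu} \binom{\mu}{k}\left[\left(\gamma^{\phi}_k - \langle w \rangle \gamma^{\mathcal{E}}_k\right) M_{\mu-k} - \gamma^{\mathcal{E}}_k M_{\mu-k+1}\right], \qquad \mu = 2, 3, \ldots \tag{25} \]

with $M_0 = 1$, $M_1 = 0$ and $\langle w \rangle = \gamma^{\phi}_1/\gamma^{\mathcal{E}}_1$.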



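For $\mu = 2, 3, 4$ we obtain

\[
\begin{aligned}
M_2 &= \frac{1}{2}\frac{\gamma^{\phi}_2}{\gamma^{\mathcal{E}}_1} - \frac{1}{2}\frac{\gamma^{\phi}_1\gamma^{\mathcal{E}}_2}{(\gamma^{\mathcal{E}}_1)^2}, \\
M_3 &= \frac{1}{3}\frac{\gamma^{\phi}_3}{\gamma^{\mathcal{E}}_1} - \frac{1}{3}\frac{\gamma^{\phi}_1\gamma^{\mathcal{E}}_3}{(\gamma^{\mathcal{E}}_1)^2} - \frac{1}{2}\frac{\gamma^{\phi}_2\gamma^{\mathcal{E}}_2}{(\gamma^{\mathcal{E}}_1)^2} + \frac{1}{2}\frac{\gamma^{\phi}_1(\gamma^{\mathcal{E}}_2)^2}{(\gamma^{\mathcal{E}}_1)^3}, \\
M_4 &= \frac{3}{4}\frac{(\gamma^{\phi}_2)^2}{(\gamma^{\mathcal{E}}_1)^2} - \frac{3}{2}\frac{\gamma^{\phi}_1\gamma^{\phi}_2\gamma^{\mathcal{E}}_2}{(\gamma^{\mathcal{E}}_1)^3} + \frac{3}{4}\frac{(\gamma^{\phi}_1)^2(\gamma^{\mathcal{E}}_2)^2}{(\gamma^{\mathcal{E}}_1)^4} + \frac{1}{4}\frac{\gamma^{\phi}_4}{\gamma^{\mathcal{E}}_1} - \frac{1}{4}\frac{\gamma^{\phi}_1\gamma^{\mathcal{E}}_4}{(\gamma^{\mathcal{E}}_1)^2} - \frac{1}{2}\frac{\gamma^{\phi}_2\gamma^{\mathcal{E}}_3}{(\gamma^{\mathcal{E}}_1)^2} \\
&\quad - \frac{1}{2}\frac{\gamma^{\phi}_3\gamma^{\mathcal{E}}_2}{(\gamma^{\mathcal{E}}_1)^2} + \frac{\gamma^{\phi}_1\gamma^{\mathcal{E}}_2\gamma^{\mathcal{E}}_3}{(\gamma^{\mathcal{E}}_1)^3} + \frac{3}{4}\frac{\gamma^{\phi}_2(\gamma^{\mathcal{E}}_2)^2}{(\gamma^{\mathcal{E}}_1)^3} - \frac{3}{4}\frac{\gamma^{\phi}_1(\gamma^{\mathcal{E}}_2)^3}{(\gamma^{\mathcal{E}}_1)^4}.
\end{aligned}
\tag{26}
\]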

We can see from $M_3$ alone that in general the equilibrium weight distribution is not Gaussian. For generic PSP $\mathcal{E}$ and learning rule $\mathcal{L}$ there are no polynomial relations amongst the coefficients $\gamma^{\phi}_n$ and $\gamma^{\mathcal{E}}_n$, hence $M_3$ is generically nonzero.
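
The recurrence Eq. (23) is easy to evaluate numerically once the coefficients $\gamma^{\phi}_n$ and $\gamma^{\mathcal{E}}_n$ are computed by quadrature. The sketch below is an illustration under stated assumptions: the sensory input, PSP, learning rule and all parameter values are hypothetical choices (not taken from mormyrid data), the piecewise linear gain function is taken in its linear regime with slope $1/(2\nu)$ and threshold $\theta$, and the function names are invented for this example. It computes the raw moments from the recurrence and compares the resulting variance with the closed form for $M_2$.

```python
import numpy as np
from math import comb

# Hypothetical model parameters (illustrative only).
T, nu, theta = 1.0, 1.0, 0.0
alpha, lam = 0.006, 0.01

# Uniform midpoint grid on [0, T); the mean over the grid approximates (1/T) * integral.
x = (np.arange(2000) + 0.5) * T / 2000
phi = 0.3 * np.sin(2 * np.pi * x / T)                  # assumed periodic sensory input
E = 1.0 + 0.5 * np.cos(2 * np.pi * x / T)              # assumed periodized PSP
L = -lam * (1.0 + 0.5 * np.cos(2 * np.pi * x / T))     # assumed periodized learning rule

def gamma_phi(n):
    """Coefficient gamma^phi_n, including the Kronecker delta term for n = 0."""
    kron = 1.0 if n == 0 else 0.0
    integrand = 0.5 * (1 + (phi - theta) / nu) * ((alpha + L) ** n - alpha ** n)
    return kron - alpha ** n - integrand.mean()

def gamma_E(n):
    """Coefficient gamma^E_n."""
    return (0.5 * E / nu * ((alpha + L) ** n - alpha ** n)).mean()

def raw_moments(order):
    """Raw moments <w^0>, ..., <w^order> from the lower triangular recurrence."""
    m = [1.0]
    for mu in range(1, order + 1):
        total = 0.0
        for k in range(mu):
            Gamma = comb(mu, k) * gamma_phi(mu - k)
            if k >= 1:
                Gamma -= comb(mu, k - 1) * gamma_E(mu - k + 1)
            total += Gamma * m[k]
        m.append(total / (mu * gamma_E(1)))
    return m

m = raw_moments(3)
mean = m[1]
M2 = m[2] - mean ** 2
M3 = m[3] - 3 * mean * m[2] + 2 * mean ** 3

# Closed form for the variance in terms of the gamma coefficients.
M2_closed = (0.5 * gamma_phi(2) / gamma_E(1)
             - 0.5 * gamma_phi(1) * gamma_E(2) / gamma_E(1) ** 2)
```

With these assumed parameters the equilibrium mean is `gamma_phi(1) / gamma_E(1)`, and the variance from the recurrence agrees with the closed form to rounding error, which is a useful consistency check when implementing the hierarchy for other choices of PSP and learning rule.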

To determine the dependence of the moments on step size, we multiply both $\alpha$ and $\mathcal{L}$, and hence the steps of the random walk, by a scalar $\lambda$. The coefficients $\gamma^{\phi}_n$ and $\gamma^{\mathcal{E}}_n$ are then both $O(\lambda^n)$, and substitution into Eq. (26) yields

\[ M_2 = O(\lambda), \qquad M_3 = O(\lambda^2), \qquad M_4 = 3M_2^2 + O(\lambda^3). \]

Hence as $\lambda \to 0$ the skew and kurtosis approach Gaussian values:

\[ \text{skew} \equiv \frac{M_3}{M_2^{3/2}} = O(\lambda^{1/2}) \to 0, \qquad \text{kurtosis} \equiv \frac{M_4}{M_2^2} - 3 = O(\lambda) \to 0. \]

[Footnote 3] One could also derive moment equations via the more direct route of Taylor expanding, in $w$, the equilibrium condition (17) for $P(w)$; but the resulting moment equations are not triangular. In fact they are fully coupled (each equation involving all moments, in general) and hence not readily solvable.
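
The single-weight predictions can be compared with a direct Monte Carlo simulation of the random walk Eq. (16). The following is a minimal sketch, assuming hypothetical forms for the sensory input, periodized PSP and periodized learning rule, and hypothetical parameter values chosen so that the confinement condition holds; none of these choices come from the paper's examples. A postsynaptic spike time is sampled by drawing $x$ uniformly on $[0, T)$ and accepting a spike with probability $\tilde{f}(x, w)$, which reproduces the spike density $\frac{1}{T}\tilde{f}(x, w)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model parameters and functions (illustrative only).
T, nu, theta = 1.0, 1.0, 0.0
alpha, lam = 0.006, 0.01

def phi(x):
    return 0.3 * np.sin(2 * np.pi * x / T)

def E_ring(x):
    return 1.0 + 0.5 * np.cos(2 * np.pi * x / T)

def L_ring(x):
    return -lam * (1.0 + 0.5 * np.cos(2 * np.pi * x / T))

def f_tilde(x, w):
    """Piecewise linear spike probability, clipped to [0, 1]."""
    return np.clip(0.5 * (1.0 + (phi(x) - theta + w * E_ring(x)) / nu), 0.0, 1.0)

def simulate(n_walkers=500, n_steps=4000, burn_in=2000):
    """Run independent walkers; return the weights visited after the burn-in."""
    w = np.zeros(n_walkers)
    samples = []
    for step in range(n_steps):
        x = rng.uniform(0.0, T, size=n_walkers)          # candidate spike times
        spike = rng.random(n_walkers) < f_tilde(x, w)    # spike with probability f_tilde
        w = w + alpha + spike * L_ring(x)                # nonassociative + associative step
        if step >= burn_in:
            samples.append(w.copy())
    return np.concatenate(samples)

# Predicted mean and variance from the gamma coefficients (valid for n >= 1).
xq = (np.arange(2000) + 0.5) * T / 2000

def g_phi(n):
    return -alpha ** n - (0.5 * (1 + (phi(xq) - theta) / nu)
                          * ((alpha + L_ring(xq)) ** n - alpha ** n)).mean()

def g_E(n):
    return (0.5 * E_ring(xq) / nu * ((alpha + L_ring(xq)) ** n - alpha ** n)).mean()

mean_pred = g_phi(1) / g_E(1)
var_pred = 0.5 * g_phi(2) / g_E(1) - 0.5 * g_phi(1) * g_E(2) / g_E(1) ** 2

samples = simulate()
mean_sim, var_sim = samples.mean(), samples.var()
```

Because successive weights of one walker are strongly correlated (the relaxation time is of order $1/|\gamma^{\mathcal{E}}_1|$ steps), the simulation uses many independent walkers and discards an initial burn-in; the simulated mean and variance should then agree with the predictions to within sampling error.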
VI. MULTIPLE WEIGHTS

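We now apply the technique to the case of multiple weights $w_i$, $i = 1, 2, \ldots, N$. The algebra is more complicated, but the structure of the derivation is identical to the single weight case. For notational compactness we introduce vector notation:

\[ \mathbf{w}(t) = \begin{pmatrix} w_1(t) \\ \vdots \\ w_N(t) \end{pmatrix}, \quad \boldsymbol{\alpha} = \begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_N \end{pmatrix}, \quad \mathring{\boldsymbol{\mathcal{E}}}(x) = \begin{pmatrix} \mathring{\mathcal{E}}_1(x - x_1) \\ \vdots \\ \mathring{\mathcal{E}}_N(x - x_N) \end{pmatrix}, \quad \mathring{\boldsymbol{\mathcal{L}}}(x) = \begin{pmatrix} \mathring{\mathcal{L}}_1(x - x_1) \\ \vdots \\ \mathring{\mathcal{L}}_N(x - x_N) \end{pmatrix}. \]

The random walk for the weight vector $\mathbf{w}(t)$ takes place in $\mathbb{R}^N$, with the walk for each component $w_i(t)$ given by Eq. (15). In vector notation the walk for $\mathbf{w}(t)$ is then

\[ \Delta \mathbf{w}(t) = \begin{cases} \boldsymbol{\alpha} + \mathring{\boldsymbol{\mathcal{L}}}(x), & \text{density } \frac{1}{T}\tilde{f}(x, \mathbf{w}(t)) \\ \boldsymbol{\alpha}, & \text{probability } 1 - \frac{1}{T}\int_0^T dx\, \tilde{f}(x, \mathbf{w}(t)), \end{cases} \tag{27} \]

where

\[ \tilde{f}(x, \mathbf{w}(t)) = f\big(\phi(x) + \mathbf{w}(t) \cdot \mathring{\boldsymbol{\mathcal{E}}}(x)\big), \]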



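and $\cdot$ indicates the vector dot product. Again, the step sizes are independent of position, so the equilibrium condition Eq. (4) applies. We have

\[ P(\mathbf{w}) = \left[1 - \frac{1}{T}\int_0^T dx\, \tilde{f}(x, \mathbf{w} - \boldsymbol{\alpha})\right] P(\mathbf{w} - \boldsymbol{\alpha}) + \frac{1}{T}\int_0^T dx\, \tilde{f}\big(x, \mathbf{w} - (\boldsymbol{\alpha} + \mathring{\boldsymbol{\mathcal{L}}}(x))\big)\, P\big(\mathbf{w} - (\boldsymbol{\alpha} + \mathring{\boldsymbol{\mathcal{L}}}(x))\big). \tag{28} \]

As before, we take the (now $N$-dimensional) Fourier transform on both sides. Applying $\int d\mathbf{w}\, e^{i\mathbf{k}\cdot\mathbf{w}}$, changing variables and rearranging yields

\[ \hat{P}(\mathbf{k})\left[1 - e^{i\mathbf{k}\cdot\boldsymbol{\alpha}}\right] = \frac{1}{T}\int_0^T dx\, \eta(x) \int d\mathbf{w}'\, e^{i\mathbf{k}\cdot\mathbf{w}'}\, \tilde{f}(x, \mathbf{w}')\, P(\mathbf{w}'), \tag{29} \]

where

\[ \eta(x) = e^{i\mathbf{k}\cdot(\boldsymbol{\alpha} + \mathring{\boldsymbol{\mathcal{L}}}(x))} - e^{i\mathbf{k}\cdot\boldsymbol{\alpha}}. \tag{30} \]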



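We now assume the postsynaptic gain function $f$ is piecewise linear and given by Eq. (19), hence $\tilde{f}$ is given by Eq. (20), with $U(x) = \phi(x) - \theta + \mathbf{w} \cdot \mathring{\boldsymbol{\mathcal{E}}}(x)$. And as before, we assume $P(\mathbf{w})$ is negligible for $\mathbf{w}$ such that $U(x) < -\nu$ or $U(x) > \nu$, a confinement condition on $P(\mathbf{w})$, which will be justified later. Then we may replace $\tilde{f}(x, \mathbf{w}')$ under the integral by the linear function of $\mathbf{w}'$

\[ \frac{1}{2}\left(1 + \frac{\phi(x) - \theta + \mathbf{w}' \cdot \mathring{\boldsymbol{\mathcal{E}}}(x)}{\nu}\right). \]

Using $\int d\mathbf{w}\, e^{i\mathbf{k}\cdot\mathbf{w}}\, w_j P(\mathbf{w}) = \frac{1}{i}\frac{\partial \hat{P}(\mathbf{k})}{\partial k_j}$, we obtain the following first-order PDE for $\hat{P}(\mathbf{k})$:

\[ \hat{P}(\mathbf{k})\left[1 - e^{i\mathbf{k}\cdot\boldsymbol{\alpha}} - \frac{1}{T}\int_0^T dx\, \frac{1}{2}\left(1 + \frac{\phi(x) - \theta}{\nu}\right)\eta(x)\right] = \sum_{j=1}^{N} \frac{1}{i}\frac{\partial \hat{P}(\mathbf{k})}{\partial k_j}\, \frac{1}{T}\int_0^T dx\, \frac{1}{2}\frac{\mathring{\mathcal{E}}_j(x - x_j)}{\nu}\,\eta(x). \tag{31} \]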



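Taylor expansion of both sides of this equation around $\mathbf{k} = 0$ will yield recurrence relations for the moments of $\mathbf{w}$. The Taylor expansion of a function $g$ on $\mathbb{R}^N$ is given by

\[ g(k_1, \ldots, k_N) = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{\mathbf{s}} \binom{n}{s_1 \cdots s_N} \left.\frac{\partial^n g(z_1, \ldots, z_N)}{\partial z_1^{s_1} \cdots \partial z_N^{s_N}}\right|_{\mathbf{z}=0} \prod_{i=1}^{N} k_i^{s_i}. \tag{32} \]

The expansions of the complex exponentials in Eq. (31) are thus

\[ e^{i\mathbf{k}\cdot\boldsymbol{\alpha}} = \sum_{n=0}^{\infty} \frac{i^n}{n!} \sum_{\mathbf{s}} \binom{n}{\mathbf{s}} \prod_{i=1}^{N} \alpha_i^{s_i} k_i^{s_i}, \qquad e^{i\mathbf{k}\cdot(\boldsymbol{\alpha} + \mathring{\boldsymbol{\mathcal{L}}}(x))} = \sum_{n=0}^{\infty} \frac{i^n}{n!} \sum_{\mathbf{s}} \binom{n}{\mathbf{s}} \prod_{i=1}^{N} \big(\alpha_i + \mathring{\mathcal{L}}_i(x - x_i)\big)^{s_i} k_i^{s_i}, \tag{33} \]

where in the sums on the right, $\mathbf{s} = (s_1 s_2 \cdots s_N)^T$ with each $s_i$ a nonnegative integer and $\sum_{i=1}^{N} s_i = n$. For brevity we write $\binom{n}{\mathbf{s}}$ for the multinomial coefficient in Eq. (32).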

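As before, for the expansion of the characteristic function $\hat{P}(\mathbf{k})$ we expand the exponential in the definition of $\hat{P}(\mathbf{k})$ and invert the order of summation and integration:

\[ \hat{P}(\mathbf{k}) = \int d\mathbf{w}\, e^{i\mathbf{k}\cdot\mathbf{w}} P(\mathbf{w}) = \sum_{m=0}^{\infty} \frac{i^m}{m!} \sum_{\mathbf{r}} \binom{m}{\mathbf{r}} \left[\int d\mathbf{w}\, P(\mathbf{w}) \prod_{i=1}^{N} w_i^{r_i}\right] \prod_{i=1}^{N} k_i^{r_i} = \sum_{m=0}^{\infty} \frac{i^m}{m!} \sum_{\mathbf{r}} \binom{m}{\mathbf{r}} \langle w_1^{r_1} \cdots w_N^{r_N} \rangle \prod_{i=1}^{N} k_i^{r_i}, \tag{34} \]

where $\mathbf{r} = (r_1 r_2 \cdots r_N)^T$ with each $r_i$ a nonnegative integer and $\sum_{i=1}^{N} r_i = m$. From this expansion of $\hat{P}(\mathbf{k})$ it follows that

\[ \frac{1}{i}\frac{\partial \hat{P}(\mathbf{k})}{\partial k_j} = \sum_{m=1}^{\infty} \frac{i^{m-1}}{m!} \sum_{\mathbf{r}} \binom{m}{\mathbf{r}} \langle w_1^{r_1} \cdots w_N^{r_N} \rangle\, r_j\, k_j^{r_j - 1} \prod_{i \ne j} k_i^{r_i}. \tag{35} \]

Using the combinatorial identity

\[ \frac{r_j}{m!}\binom{m}{\mathbf{r}} = \frac{1}{(m-1)!}\binom{m-1}{r_1 \cdots r_j - 1 \cdots r_N} \]

we may reindex Eq. (35) to yield

\[ \frac{1}{i}\frac{\partial \hat{P}(\mathbf{k})}{\partial k_j} = \sum_{m=0}^{\infty} \frac{i^m}{m!} \sum_{\mathbf{r}} \binom{m}{\mathbf{r}} \langle w_j\, w_1^{r_1} \cdots w_N^{r_N} \rangle \prod_{i=1}^{N} k_i^{r_i}. \tag{36} \]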



When the expansions Eqs. (|34|) . and (|35|l are sub- 

stituted into Eq. equating the coefficients of Yli kf 

on both sides yields 



1 f^Ji 



E 



1 



n!m! \ r 



r-\-s—q 
N 

J = l 



(37) 



where 

it = 



1 

T 



involves the quantities 7!^ and 7.^. For generic £ and £ 
there are no polynomial relations amongst these quan- 
tities; hence the determinant of the coefficient matrix is 
generically nonzero, and the system can be inverted to 
give the moments of total order /i in terms of 7!'^, 7.^, 
and the moments of total order less than The com- 
plete moment hierarchy can thus be obtained: first mo- 
ments of total order 1, then moments of total order 2, 
and so on. 



A. Equilibrium Mean 

For — 1 we must have qj = 5ij for some i. Since in 
Eq. H38|) only terms with m < appear, and m = 7'j, 
the only possibility for r is r = 0, and then Sj = qj — Sij. 
The recurrence relation Eq. H38|l then becomes 



= «. + ^ dx-il 



N 



0(.t) 



— Ho 



V 



+ dx —8j{x ~ Xj)£i{x - Xi). {39) 

Allowing i to vary over all possible values 1, 2, . . . , TV, we 
have N linear equations in the N unknowns wt, which 
can be written in vector form as 



l_ 1^ ^^lEJx-x^ 
TJo 2 V 



(Y[ia + k^)ri'-U'^t'), 



and q = {qiq2 . . . qN)'^ , each qi a nonnegative integer, 



with X^iLi Qi — 1^- A slight simplification follows from 
7q = 1 and 7(f = 0: the quantity on the left side of Eq. 
(|37|l is cancelled by the term on the right side with s = 
and r = q. The resulting recurrence relations are 



) C{w) = d, 

with the matrix C and vector d given by 



(40) 



1 1 f ° ° 
— - / dx8j{x~Xj)C,{x-Xi), (41) 



d, = a,- / dx-{l 



V 



-)C,ix-x,). (42) 



The overall minus sign in the definition of C is for later 
convenience. For generic £ and C the matrix C is invert- 
ible, and we have (w) = C~^d. The physical meaning of 
this relation can be illuminated by rewriting Eq. (|39|l as 
follows: 



0- E 



n\m\ \ r 



r-\-s—q 

N 



-^7^^W- 
i=i 



< < /i. 



N 

E' 

1=1 



(38) 



For each choice of q we obtain a single linear equation 
involving moments of total order at most fj, = q^. Re- 
garding the moments of total order /i as unknowns, to be 
solved for in terms of moments of total order less than /i, 
we have a linear system with the same number of equa- 
tions as unknowns. The coefficient matrix of this system 



= a. 



TJo 



(l)(x) - + Y,f^^{wj)£j{x - Xj) o 



^1 dxil + — -)Ci{x-Xi) 



V 



= (^i + if I dx{f){x)Ci{x - Xi), 





where {f){x) we define to be the value of f{x) when w — 
(w). Now add and subtract Jo dx (/)(x) to obtain 



= [1-^/ dx{f)ix)] 



a. 



1 

TJo 

(Aw. 



dx (/) (x) (aj + Ci{x~ Xi)) 



(43) 
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We find that the cquihbrium mean weight vector (w) 
is that for which the mean weight change is zero for 
all weights. This condition is obvious on independent 
grounds, and could have been used to calculate (w) di- 
rectly, without recourse to the moment hierarchy rela- 
tions. But for moments of total order 2 or higher, trans- 
parent conditions such as this arc not available; in that 
case we have no choice but to solve Eq. (|38|) . 

Given the equilibrium mean weights (w) , we can calcu- 
late the equilibrium mean postsynaptic potential (U) (x) 



via 



{U){x) = (P{x)+£{x) ■ {w) 

= (t){x)+£{x)-C-^d, 
provided C is invertible. 
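The mean-weight relations can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the linearized gain $f = \frac{1}{2}(1 + (U+\theta)/V)$ without clipping at the tails, and it uses hypothetical smooth periodic bumps (`bump`) in place of the actual PSP and learning rule, with arbitrary illustrative parameter values. It builds $C$ and $d$ by quadrature, solves $C\langle w\rangle = d$, and confirms that the mean step in Eq. (43) vanishes at the solution.

```python
import numpy as np

# Illustrative parameters (all values are assumptions, not taken from the paper).
T, V, theta, N = 1.0, 1.0, 0.1, 8
x = np.linspace(0.0, T, 400, endpoint=False)   # quadrature grid on [0, T)
dx = T / x.size
spike_times = np.arange(N) * T / N             # presynaptic spike times x_i

def bump(s, width):
    """Smooth T-periodic bump; a hypothetical stand-in for a periodized kernel."""
    return np.exp((np.cos(2.0 * np.pi * s / T) - 1.0) / width**2)

E = np.array([bump(x - xi, 0.6) for xi in spike_times])          # E[j] = E_j(x - x_j)
L = -0.05 * np.array([bump(x - xi, 0.4) for xi in spike_times])  # L[i] = L_i(x - x_i)
phi = 0.3 * np.sin(2.0 * np.pi * x / T)        # weight-independent input phi(x)
alpha = np.full(N, 0.01)                       # nonassociative step

gain0 = 0.5 * (1.0 + (phi + theta) / V)        # linear gain evaluated at w = 0

C = -(L @ E.T) * dx / (2.0 * V * T)            # matrix C, as in Eq. (41)
d = alpha + (L * gain0).sum(axis=1) * dx / T   # vector d, as in Eq. (42)
w_mean = np.linalg.solve(C, d)                 # equilibrium mean, Eq. (40)

# Mean step at w = <w>; Eq. (43) says this should vanish for every weight.
U_mean = phi + w_mean @ E
f_mean = 0.5 * (1.0 + (U_mean + theta) / V)    # linear form only, no clipping
mean_step = alpha + (L * f_mean).sum(axis=1) * dx / T
```

The check is purely algebraic: it does not test whether the resulting $\langle U\rangle(x)$ satisfies the confinement condition, which would have to be verified separately for any concrete choice of kernels.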

B. Equilibrium Variance 

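We now take $\mu = 2$ and $q_k = \delta_{ik} + \delta_{jk}$ in Eq. (38). After some simplification, using $C$, $d$, $\langle f\rangle$ and $\langle w\rangle$ from above, we obtain

$$\begin{aligned}
0 = {} & -\sum_{k=1}^N C_{jk}\langle w_k w_i\rangle - \sum_{k=1}^N C_{ik}\langle w_k w_j\rangle + \langle w_i\rangle d_j + \langle w_j\rangle d_i + \alpha_i\alpha_j \\
& + \frac{1}{T}\int_0^T dx\,\langle f\rangle(x)\Big[\big(\alpha_i+\mathcal{L}_i(x-x_i)\big)\big(\alpha_j+\mathcal{L}_j(x-x_j)\big) - \alpha_i\alpha_j\Big].
\end{aligned}$$

Using $d = C\langle w\rangle$, this can be rearranged to give

$$\begin{aligned}
& \sum_{k=1}^N C_{jk}\langle w_k w_i\rangle + \sum_{k=1}^N C_{ik}\langle w_k w_j\rangle - \langle w_i\rangle\sum_{k=1}^N C_{jk}\langle w_k\rangle - \langle w_j\rangle\sum_{k=1}^N C_{ik}\langle w_k\rangle \\
&\quad = \left[1-\frac{1}{T}\int_0^T dx\,\langle f\rangle(x)\right]\alpha_i\alpha_j + \frac{1}{T}\int_0^T dx\,\langle f\rangle(x)\,\big(\alpha_i+\mathcal{L}_i(x-x_i)\big)\big(\alpha_j+\mathcal{L}_j(x-x_j)\big) \\
&\quad = \langle\Delta w_i\,\Delta w_j\rangle.
\end{aligned}$$

In vector form this becomes

$$C\big(\langle ww^T\rangle - \langle w\rangle\langle w\rangle^T\big) + \big(\langle ww^T\rangle - \langle w\rangle\langle w\rangle^T\big)C^T = \langle\Delta w\,\Delta w^T\rangle. \tag{44}$$

The covariance of a vector random variable $v$ is $\mathrm{cov}\,v = \langle vv^T\rangle - \langle v\rangle\langle v\rangle^T$. Equation (44) then takes the compact form

$$C\,(\mathrm{cov}\,w) + (\mathrm{cov}\,w)\,C^T = \mathrm{cov}\,\Delta w, \tag{45}$$

where we have used the equilibrium mean condition $\langle\Delta w\rangle = 0$ on the right side. Equation (45) is a Lyapunov equation [31] for $\mathrm{cov}\,w$, giving the equilibrium weight covariance in terms of $C$ (which depends on $\mathcal{E}$ and $\mathcal{L}$) and $\mathrm{cov}\,\Delta w$ (which depends on $\langle f\rangle$, $\alpha$, and $\mathcal{L}$). Both $C$ and $\mathrm{cov}\,\Delta w$ can be calculated from the parameters of the system, and then the equilibrium covariance $\mathrm{cov}\,w$, if it exists, must satisfy Eq. (45).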
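For small systems the Lyapunov equation can be solved directly as a linear system. The sketch below is illustrative only: the helper `solve_lyapunov` and the example matrices are hypothetical, chosen so that $C$ has eigenvalues with positive real part (by Gershgorin's theorem) and the right side is symmetric positive definite. It vectorizes the unknown in row-major order, so that $CX$ corresponds to `kron(C, I)` and $XC^T$ to `kron(I, C)`.

```python
import numpy as np

def solve_lyapunov(C, S):
    """Solve C X + X C^T = S for X by vectorizing X in row-major order."""
    n = C.shape[0]
    I = np.eye(n)
    M = np.kron(C, I) + np.kron(I, C)       # acts on X.reshape(-1)
    return np.linalg.solve(M, S.reshape(-1)).reshape(n, n)

# Hypothetical example: Gershgorin discs of C lie in the right half plane.
C = np.array([[2.0, 0.5, 0.0],
              [-0.3, 1.5, 0.2],
              [0.1, 0.0, 1.0]])
# Symmetric, diagonally dominant, hence positive definite.
S = np.array([[1.0, 0.2, 0.0],
              [0.2, 1.0, 0.1],
              [0.0, 0.1, 0.5]])

cov_w = solve_lyapunov(C, S)
residual = C @ cov_w + cov_w @ C.T - S
```

The Kronecker system has $N^2$ unknowns, so this approach is only practical for modest $N$; it is meant as a reference solution against which structured methods can be compared.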

A theorem of Ostrowski and Schneider [^J, 113 gives 
conditions for existence and uniqueness of solutions to 
Lyapunov equations. If S is symmetric positive definite 
and A and —A have no common eigenvalues, then the 
Lyapunov equation AH + HA^ = S has a unique so- 
lution H. Furthermore, H is symmetric, and has the 
same inertia (number of eigenvalues with positive, zero, 
or negative real part) as A. 

Since $\mathrm{cov}\,\Delta w$ is necessarily symmetric positive definite, the theorem says that a symmetric solution $\mathrm{cov}\,w$ to Eq. (45) exists uniquely provided $C$ and $-C$ have no common eigenvalues, and $\mathrm{cov}\,w$ is positive definite if and only if all eigenvalues of $C$ have positive real part.

The condition that $C$ and $-C$ have no common eigenvalues is true for generic $C$ and hence for generic $\mathcal{E}$ and $\mathcal{L}$. The condition that $\mathrm{cov}\,w$ be positive definite is needed in order to interpret $\mathrm{cov}\,w$ as the covariance matrix of a probability distribution; we say $\mathrm{cov}\,w$ is physical if it is positive definite. Denoting by $\lambda_n^C$ the $n$th eigenvalue of $C$, we then have the following physicality condition:

$$\mathrm{cov}\,w \ \text{physical} \iff \mathrm{Re}\,\lambda_n^C > 0 \ \text{for all } n. \tag{46}$$



A theorem of Heinz [31, 33] says that if all eigenvalues of $A$ have positive real part and all eigenvalues of $B$ have negative real part, then the (unique) solution $X$ to the equation $AX - XB = Y$ is given by

$$X = \int_0^\infty ds\, e^{-sA}\,Y\,e^{sB}, \tag{47}$$

where the matrix exponentials are defined via Taylor expansions. The assumptions on the eigenvalues of $A$ and $B$ ensure that the integral in Eq. (47) converges, and one can show by direct substitution that the resulting $X$ satisfies $AX - XB = Y$. If the physicality condition (46) holds, then $C$ and $-C^T$ satisfy the conditions for $A$ and $B$ respectively, and we obtain

$$\mathrm{cov}\,w = \int_0^\infty ds\, e^{-sC}\,(\mathrm{cov}\,\Delta w)\,e^{-sC^T}. \tag{48}$$

This gives the equilibrium covariance matrix explicitly in terms of system parameters.
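For completeness, the direct substitution mentioned above can be written out in one line. Taking the integral solution to be $X = \int_0^\infty ds\, e^{-sA}\,Y\,e^{sB}$, and using the fact that $A$ commutes with $e^{-sA}$ and $B$ with $e^{sB}$,

$$AX - XB = \int_0^\infty ds\,\Big(A\,e^{-sA}\,Y\,e^{sB} - e^{-sA}\,Y\,e^{sB}\,B\Big) = -\int_0^\infty ds\,\frac{d}{ds}\Big(e^{-sA}\,Y\,e^{sB}\Big) = Y - \lim_{s\to\infty} e^{-sA}\,Y\,e^{sB} = Y.$$

The boundary term at infinity vanishes precisely because the eigenvalues of $A$ have positive real part and those of $B$ have negative real part. Setting $A = C$, $B = -C^T$ and $Y = \mathrm{cov}\,\Delta w$ turns $AX - XB = Y$ into the Lyapunov equation $C X + X C^T = \mathrm{cov}\,\Delta w$.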

Since the postsynaptic potential $U(x)$ is a deterministic function of the synaptic weight vector $w$, the weight covariance $\mathrm{cov}\,w$ determines the covariance of the postsynaptic potential. From $U(x) = \phi(x) + \mathcal{E}(x)\cdot w$, we have

$$\mathrm{cov}\big(U(x), U(y)\big) = \mathcal{E}(x)^T\,(\mathrm{cov}\,w)\,\mathcal{E}(y) \tag{49}$$

for any pair of times $x$, $y$ in the interval $[0, T]$. Of particular interest is the diagonal variance of $U(x)$:

$$\mathrm{cov}\big(U(x), U(x)\big) = \mathcal{E}(x)^T\,(\mathrm{cov}\,w)\,\mathcal{E}(x). \tag{50}$$

Our derivation of the equilibrium moment hierarchy equations relied on the equilibrium distribution of $U(x)$ being negligible on the "tails" of the postsynaptic spike probability function $f$. We will show in the next section, for the case of homogeneous parameters, that the confinement condition on $U(x)$ can always be satisfied by adjusting the rates of associative and nonassociative learning.

Note that for an extended PSP $\mathcal{E}$, Eq. (50) implies that the diagonal variance of $U(x)$ depends on the full matrix $\mathrm{cov}\,w$; in other words, it depends not only on the diagonal variances of the synaptic weights $w$, but also on the off-diagonal correlations between different synaptic weights.



VII. MULTIPLE WEIGHTS, HOMOGENEOUS 
PARAMETERS 

For maximal generality in the foregoing analysis, we 
have allowed the postsynaptic potential functions and 
spike-timing dependent learning rules to be different 
for different presynaptic neurons, and have allowed the 
presynaptic spike times to be arbitrary. Further ana- 
lytical progress can be made in the case where the sys- 
tem parameters are homogeneous, i.e. the postsynaptic 
potential functions and spike-timing dependent learning 
rules are the same for all presynaptic neurons, and the 
presynaptic spike times are regularly spaced. 

For such parameters it will turn out that the matrix $C$, the coefficient matrix in the Lyapunov equation (45) for $\mathrm{cov}\,w$, has a special form: it is circulant [3^. The matrix $\mathrm{cov}\,\Delta w$ on the right side of the Lyapunov equation for $\mathrm{cov}\,w$ is not circulant in general, but it is circulant if the postsynaptic spike probability density $\langle f\rangle(x)$ is independent of $x$. Now it was shown in [18] that in the case of homogeneous parameters, if the spacing $\delta$ between presynaptic spike times is sufficiently small and provided certain other constraints hold, the (mean) equilibrium weight vector has the property that the mean total postsynaptic potential $\langle U\rangle(x)$ is approximately constant*. In that case the mean equilibrium postsynaptic spike density $\langle f\rangle(x)$ is also approximately constant, and the matrix $\mathrm{cov}\,\Delta w$ is approximately a circulant matrix $D$. The Lyapunov equation for $\mathrm{cov}\,w$ is then approximately

$$C\,(\mathrm{cov}\,w) + (\mathrm{cov}\,w)\,C^T = D, \tag{51}$$

with solution given by

$$\mathrm{cov}\,w = \int_0^\infty ds\, e^{-sC}\,D\,e^{-sC^T}. \tag{52}$$

* The present model differs from the model in [18] in having a postsynaptic spike probability density instead of a mean postsynaptic spike rate, but the argument is unaffected.



The eigenvalues and eigenvectors of circulant matrices are easily calculated; furthermore, all circulant matrices can be simultaneously diagonalized. Simultaneous diagonalization of $C$, $C^T$, and $D$ in Eq. (52) will yield an explicit solution for $\mathrm{cov}\,w$ in terms of the eigenvectors and eigenvalues of $C$ and $D$, which will themselves be written as explicit functions of the system parameters.

Let $\mathcal{E}(s)$, $\mathcal{L}(s)$, and $\alpha$ denote the common postsynaptic potential function, associative learning rule, and nonassociative learning rule respectively. Let the spike time for presynaptic cell $i$ be $x_i = (i-1)\delta$, $i = 1, 2, \ldots, N$, with $\delta = T/N$. We then have

$$C_{ij} = -\frac{1}{T}\,\frac{1}{2V}\int_0^T dx\,\mathcal{E}(x-x_j)\,\mathcal{L}(x-x_i), \tag{53}$$

and for $\langle f\rangle(x)$ approximately the constant $\langle f\rangle$ we have $\mathrm{cov}\,\Delta w \approx D$, where

$$D_{ij} = \alpha^2\big(1-\langle f\rangle\big) + \langle f\rangle\,\frac{1}{T}\int_0^T dx\,\big(\alpha+\mathcal{L}(x-x_i)\big)\big(\alpha+\mathcal{L}(x-x_j)\big).$$

By periodicity of $\mathcal{L}$, this can be simplified to

$$D_{ij} = \alpha^2 + 2\alpha\langle f\rangle\beta + \langle f\rangle\,\frac{1}{T}\int_0^T dx\,\mathcal{L}(x-x_i)\,\mathcal{L}(x-x_j), \tag{54}$$

where $\beta = \frac{1}{T}\int_0^T dx\,\mathcal{L}(x)$.

A matrix $A$ is circulant [s^ if each row of $A$ equals the row above it shifted one entry to the right (and wrapped around at the edges); in other words

$$A_{(i+1)\bmod N,\,(j+1)\bmod N} = A_{ij} \quad\text{for all } i, j.$$

We now show that both $C$ and $D$ are circulant. First, let $g(x)$ and $h(x)$ be any periodic functions of $x$ with period $T$, and let the $\{x_i\}$ be regularly spaced on $[0, T]$ as defined above. Let $A$ be the matrix defined by

$$A_{ij} = \int_0^T dx\, g(x-x_i)\,h(x-x_j). \tag{55}$$

Taking $(i, j)$ to $((i+1) \bmod N, (j+1) \bmod N)$ in Eq. (55) shifts the argument of both functions by $-\delta$, and by periodicity this does not change the value of the integral. Hence any matrix of the form (55) is circulant.

The constant matrices (all of whose entries are the same) are also circulant; and circulant matrices are closed under addition, scalar multiplication, and transposition. Hence by Eq. (53) and (54), $C$ and $D$ are both circulant, and so is $C^T$.

It is easily shown [s^l that the vectors $u^{(n)}$, $n = 1, 2, \ldots, N$, with components

$$u_k^{(n)} = e^{-2\pi i (k-1) n/N}, \quad k = 1, 2, \ldots, N \tag{56}$$

are a complete set of eigenvectors for any circulant matrix $A$, with corresponding eigenvalue $\lambda_n$ given by

$$\lambda_n = \sum_{l=1}^N A_{jl}\, e^{2\pi i (j-l) n/N}. \tag{57}$$

The expression on the right in Eq. (57) is independent of $j$ because $A_{jl}$ and the complex exponential both depend only on $(j - l) \bmod N$. It is easily checked from Eq. (57) that adding a constant matrix (all entries the same) to a nonzero circulant matrix has no effect on its eigenvalues $\lambda_n$ for $n \neq N$; only the eigenvalue belonging to the constant eigenvector $u^{(N)}$ is shifted.

Let $R$ be the unitary matrix whose $n$th column is the normalized vector $u^{(n)}/\sqrt{N}$ and let $\Lambda$ be the diagonal matrix with entries $\lambda_n$. Then

$$A = R\,\Lambda\,R^*$$

where $R^*$ is the complex conjugate transpose of $R$.
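The circulant structure and the eigenvector relation are easy to confirm numerically. The sketch below is illustrative: the periodic functions `g` and `h` are arbitrary hypothetical choices, and the eigenvectors are taken with components $e^{-2\pi i (k-1) n/N}$ (written zero-based in the code), which is an assumed sign convention; the complex-conjugate vectors are equally valid eigenvectors, with the eigenvalue labels conjugated.

```python
import numpy as np

T, N = 1.0, 6
x = np.linspace(0.0, T, 600, endpoint=False)   # quadrature grid on [0, T)
dx = T / x.size
xs = np.arange(N) * T / N                      # regularly spaced x_i

def g(s):
    """Hypothetical T-periodic function."""
    return np.exp(np.cos(2.0 * np.pi * s / T))

def h(s):
    """Another hypothetical T-periodic function."""
    return 1.0 + 0.5 * np.sin(2.0 * np.pi * s / T) + 0.2 * np.cos(4.0 * np.pi * s / T)

# A_ij = integral of g(x - x_i) h(x - x_j) over one period
A = np.array([[np.sum(g(x - xi) * h(x - xj)) * dx for xj in xs] for xi in xs])

# Circulant test: shifting both indices by one leaves the matrix unchanged.
is_circulant = np.allclose(A, np.roll(np.roll(A, 1, axis=0), 1, axis=1))

# Eigenvector check for one mode n, using the first row of A.
k = np.arange(N)                               # zero-based index, i.e. k - 1 in the text
n = 2
u = np.exp(-2j * np.pi * n * k / N)
lam = np.sum(A[0] * np.exp(-2j * np.pi * n * k / N))
```

With this convention the eigenvalues coincide with the discrete Fourier transform of the first row, `np.fft.fft(A[0])`, which is what makes the diagonalization cheap for large $N$.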

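In the present context it will be convenient to define wavenumbers $k_n$ so that the argument of the complex exponential in Eq. (56) is $-i k_n x_k$; this we can arrange by taking $k_n = 2\pi n/T$, $n = 1, 2, \ldots, N$. From Eq. (57) the eigenvalues of $C$ and $D$ are then

$$\lambda_n^C = -\frac{1}{2VT}\sum_{l=1}^N e^{i k_n (x_j - x_l)}\int_0^T dx\,\mathcal{L}(x-x_j)\,\mathcal{E}(x-x_l),$$

$$\lambda_n^D = \frac{\langle f\rangle}{T}\sum_{l=1}^N e^{i k_n (x_j - x_l)}\int_0^T dx\,\mathcal{L}(x-x_j)\,\mathcal{L}(x-x_l),$$

where the constant part of $D_{ij}$ in Eq. (54) has been dropped, since it does not contribute for $n \neq N$. By periodicity of $\mathcal{E}$ and $\mathcal{L}$ and regular spacing of the $\{x_l\}$, these can be rewritten as

$$\lambda_n^C = -\frac{1}{2VT}\sum_{l=1}^N e^{i k_n x_l}\int_0^T dx\,\mathcal{L}(x-x_l)\,\mathcal{E}(x), \tag{58}$$

$$\lambda_n^D = \frac{\langle f\rangle}{T}\sum_{l=1}^N e^{i k_n x_l}\int_0^T dx\,\mathcal{L}(x-x_l)\,\mathcal{L}(x). \tag{59}$$

Let $\Lambda^C$ and $\Lambda^D$ be the diagonal matrices with entries $\lambda_n^C$ and $\lambda_n^D$, and let $R$ be the unitary matrix defined above, with entries

$$R_{jn} = \frac{1}{\sqrt{N}}\, e^{-i k_n x_j}.$$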

Then $C = R\Lambda^C R^*$ and $D = R\Lambda^D R^*$. Transposition takes eigenvalues to their complex conjugates, so $C^T = R\,\overline{\Lambda^C}\,R^*$. From $RR^* = I$ and Taylor expansion it follows that $e^{-sC} = R\,e^{-s\Lambda^C}R^*$ and $e^{-sC^T} = R\,e^{-s\overline{\Lambda^C}}R^*$. Substitution into Eq. (52) then yields a diagonalization of $\mathrm{cov}\,w$:

$$\mathrm{cov}\,w = R\left[\int_0^\infty ds\, e^{-s\Lambda^C}\,\Lambda^D\,e^{-s\overline{\Lambda^C}}\right]R^* = R\,\Lambda^W R^*,$$

where $\Lambda^W$ is the diagonal matrix with entries

$$\lambda_n^W = \frac{\lambda_n^D}{2\,\mathrm{Re}\,\lambda_n^C}, \tag{60}$$

provided $\mathrm{Re}\,\lambda_n^C > 0$. Since $D$ is symmetric positive definite (it is, by construction, a physical covariance matrix), we have $\lambda_n^D$ real and positive for all $n$. Recall that in order for the solution of the Lyapunov equation Eq. (51) to be positive definite, all eigenvalues of $C$ must have positive real part, i.e. $\mathrm{Re}\,\lambda_n^C > 0$ for all $n$. If this physicality condition is satisfied, then the eigenvalues of $\mathrm{cov}\,w$ given by Eq. (60) are real and positive. These eigenvalues, with $\lambda_n^C$ and $\lambda_n^D$ given by Eqs. (58) and (59), are the variances associated with the independent components of the equilibrium weight distribution. The corresponding eigenvectors are the $u^{(n)}$, with components $u_j^{(n)} = e^{-i k_n x_j}$.
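The diagonalization can be turned into a short numerical recipe. The following sketch is illustrative and uses made-up first rows for $C$ and $D$ rather than values derived from a PSP and learning rule; the helper names `circulant` and `circulant_covariance` are hypothetical. It computes the eigenvalues with the FFT, forms $\lambda_n^W = \lambda_n^D / (2\,\mathrm{Re}\,\lambda_n^C)$, and transforms back to obtain the first row of the (circulant) covariance matrix.

```python
import numpy as np

def circulant(first_row):
    """Circulant matrix with entries A[j, l] = first_row[(l - j) mod N]."""
    a = np.asarray(first_row, dtype=float)
    idx = (np.arange(a.size)[None, :] - np.arange(a.size)[:, None]) % a.size
    return a[idx]

def circulant_covariance(c_row, d_row):
    """Solve C X + X C^T = D for circulant C and symmetric circulant D."""
    lam_C = np.fft.fft(c_row)              # eigenvalues of C
    lam_D = np.fft.fft(d_row).real         # eigenvalues of D (real, D symmetric)
    if np.any(lam_C.real <= 0.0):
        raise ValueError("physicality condition Re(lambda_C) > 0 is violated")
    lam_W = lam_D / (2.0 * lam_C.real)     # eigenvalues of cov w
    return circulant(np.fft.ifft(lam_W).real)

# Hypothetical first rows: C is not symmetric, D is symmetric positive definite.
c_row = [2.0, 0.5, 0.1, 0.0, 0.0, 0.3]
d_row = [1.0, 0.3, 0.1, 0.0, 0.1, 0.3]

C, D = circulant(c_row), circulant(d_row)
cov_w = circulant_covariance(c_row, d_row)
residual = C @ cov_w + cov_w @ C.T - D
```

Because only $\mathrm{Re}\,\lambda_n^C$ enters, the result does not depend on which of the two conjugate eigenvector conventions is used. The cost is that of a few FFTs of length $N$, compared with an $N^2 \times N^2$ linear solve for a general Lyapunov equation.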

Since $V > 0$, the condition for physicality of the covariance is

$$\mathrm{Re}\left[\sum_{l=1}^N e^{i k_n x_l}\int_0^T dx\,\mathcal{L}(x-x_l)\,\mathcal{E}(x)\right] < 0 \quad\text{for all } n.$$

This coincides with the condition derived previously for stability of the mean weight state. Roughly speaking, it follows that if there exists an equilibrium weight distribution $P(w)$ (with finite covariance matrix), then the mean of the distribution must be stable. We do not address stability of the equilibrium distribution (or equivalently, stability of all moments of the equilibrium distribution) in the present paper, but a natural conjecture would be that if the equilibrium distribution $P(w)$ exists, then it is necessarily stable.

From $\mathrm{cov}\,w = R\Lambda^W R^*$ we can now write down explicit expressions for the equilibrium covariance of any pair of weights:

$$\begin{aligned}
\mathrm{cov}(w_j, w_l) &= \sum_{n,m=1}^N R_{jn}\,\Lambda^W_{nm}\,R^*_{ml} \\
&= \sum_{n=1}^N R_{jn}\,\overline{R_{ln}}\,\lambda_n^W \\
&= \frac{1}{N}\sum_{n=1}^N e^{i k_n (x_l - x_j)}\,\frac{\lambda_n^D}{2\,\mathrm{Re}\,\lambda_n^C},
\end{aligned} \tag{61}$$

with $k_n = 2\pi n/T$ and $\lambda_n^C$, $\lambda_n^D$ given by Eqs. (58) and (59).
Note that $\mathrm{cov}(w_j, w_l)$ depends on $j$ and $l$ only via the difference $(x_j - x_l) \bmod T$, due to periodicity and translational invariance of the architecture for homogeneous parameters. Also, the covariance of the weights depends only on the associative part $\mathcal{L}$ of the learning rule, since the nonassociative part $\alpha$ does not appear in Eq. (61). This is not surprising, since the role of $\alpha$ is essentially analogous to that of a constant externally applied force in a physical system. Such a force changes the position of the equilibrium, but does not alter the dynamics around the equilibrium.

A. Confinement

Our derivation of the moment hierarchy relations Eq. (38) relied on the assumption that the equilibrium weight distribution was negligible on the "tails" of the piecewise linear postsynaptic gain function $f$. This places a constraint on the mean $\langle U\rangle(x)$ and diagonal variance $\mathrm{cov}(U(x), U(x))$ of the postsynaptic potential: they must be such that the mean is a large number of standard deviations away from the tails. For each $x$, let $r(x)$ be the standard deviation of $U(x)$ divided by the distance from $\langle U\rangle(x)$ to the nearest tail, i.e. to $V - \theta$ or $-V - \theta$. The parameter $r(x)$ will be referred to as the confinement parameter for the system. The confinement condition holds provided $\langle U\rangle(x)$ is in the interval $(-V-\theta, V-\theta)$ and $r(x) \ll 1$, for all $x$.

We now argue that by adjusting only the rates of nonassociative and associative learning, the confinement condition can always be satisfied. Multiplying the associative learning rule by a positive scalar factor $\beta$ and both nonassociative and associative components by a positive scalar factor $\lambda$, we have weight changes given by

$$\Delta w(t) = \begin{cases} \lambda\big(\alpha + \beta\mathcal{L}(x)\big), & \text{density } \frac{1}{T} f(x, w(t)) \\[4pt] \lambda\alpha, & \text{probability } 1 - \frac{1}{T}\int_0^T dx\, f(x, w(t)). \end{cases} \tag{62}$$

The ratio of associative to nonassociative learning rate is parametrized by $\beta$, while the overall learning rate is parametrized by $\lambda$. Now it was shown in [18] that in the case of homogeneous parameters, under certain mild conditions, the equilibrium mean weight vector has the property that $\langle U\rangle(x)$ is approximately constant (i.e. the equilibrium is an approximate negative image state). Hence $\langle f\rangle$ in Eq. (43) is approximately constant. If it were exactly constant then Eq. (43) (for homogeneous parameters) would yield, after cancelling $\lambda$ on top and bottom,

$$\langle f\rangle = -\frac{\alpha}{\frac{\beta}{T}\int_0^T dx\,\mathcal{L}(x)}.$$

Provided $\alpha$ and $\int_0^T dx\,\mathcal{L}(x)$ have opposite sign (shown in [18] to be necessary for existence of a negative image equilibrium) the right hand side of this equation can be made to have any desired positive value by appropriate choice of $\beta > 0$. Hence $\langle f\rangle$ can be made to have any desired positive value by appropriate choice of $\beta$; in particular, a range of $\beta$ exists for which $\langle f\rangle$ falls in the open interval $(f(-V-\theta), f(V-\theta))$. Since $f$ is invertible for arguments in $(-V-\theta, V-\theta)$ and $\langle f\rangle = f(\langle U\rangle)$, it follows that by appropriate choice of $\beta$, $\langle U\rangle$ can be made to have any value in $(-V-\theta, V-\theta)$. Since $\langle f\rangle(x)$ approximately constant implies $\langle U\rangle$ approximately constant, it follows that the mean postsynaptic potential $\langle U\rangle(x)$ can always be made to lie between the tails, for all $x$.

It remains to show that the diagonal variance $\mathrm{cov}(U(x), U(x))$ can be made sufficiently small so that the distribution of $U(x)$ is negligible on the tails. We do this by holding $\beta$ fixed and varying $\lambda$. Since the matrix $C$ is proportional to $\lambda$ and the matrix $\mathrm{cov}\,\Delta w$ is proportional to $\lambda^2$, it follows from Eq. (44) that $\mathrm{cov}\,w$, and hence $\mathrm{cov}\,U$ from Eq. (49), is proportional to $\lambda$. In particular, $\mathrm{cov}(U(x), U(x))$ can be made arbitrarily small by taking $\lambda$ sufficiently small.
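The scaling step can be made explicit. Writing $C = \lambda C_1$ and $\mathrm{cov}\,\Delta w = \lambda^2 S_1$, where $C_1$ and $S_1$ denote the values at unit overall learning rate (this notation is introduced here only for illustration), the Lyapunov equation for $X = \mathrm{cov}\,w$ reads

$$\lambda\,C_1 X + \lambda\,X C_1^T = \lambda^2 S_1 \quad\Longrightarrow\quad C_1\left(\frac{X}{\lambda}\right) + \left(\frac{X}{\lambda}\right)C_1^T = S_1 .$$

So $X/\lambda$ solves the equation at unit learning rate, and by uniqueness of the solution $\mathrm{cov}\,w = \lambda\,X_1$, where $X_1$ is the covariance for $\lambda = 1$. The standard deviation of each weight therefore scales as $\sqrt{\lambda}$, and the same holds for the standard deviation of $U(x)$.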

Thus, by appropriate choice of $\beta$ and $\lambda$, the confinement condition can always be satisfied. The value of $\beta$ determines the location of the mean postsynaptic potential, and the value of $\lambda$ determines the width of the distribution around the mean. The latter fact, that the variance of the equilibrium distribution of the postsynaptic potential is proportional to the overall learning rate, has direct behavioral relevance to the mormyrid fish, since it implies a tradeoff between speed of adaptation and accuracy of the adapted state*.



B. Dense Spacing Limit 

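In the architecture of mormyrid ELL, the spacing $\delta$ between presynaptic spike times is much less than the widths $\tau_E$, $\tau_L$ of the PSP $\mathcal{E}$ and learning rule $\mathcal{L}$. In the dense spacing limit the set of discrete weights per unit time $\{w_i/T\}$ corresponding to presynaptic spikes at times $\{x_i\}$ becomes a continuum weight density $\mathcal{W}(y)$, with weight $\mathcal{W}(y)\,dy$ corresponding to presynaptic spike times between $y$ and $y + dy$. Sums over $x_i$ are replaced by integrals over $y$. The matrices $C$ and $D$ in Eq. (51) become infinite dimensional, with eigenvalues $\lambda_n^C$, $\lambda_n^D$ given by

$$\lambda_n^C = -\frac{1}{2VT}\int_0^T dy\, e^{i k_n y}\int_0^T dx\,\mathcal{L}(x-y)\,\mathcal{E}(x), \tag{63}$$

$$\lambda_n^D = \frac{\langle f\rangle}{T}\int_0^T dy\, e^{i k_n y}\int_0^T dx\,\mathcal{L}(x-y)\,\mathcal{L}(x), \tag{64}$$

for $n = 0, 1, \ldots$. We introduce some useful notation. Let $\mathcal{F}_T[h]$ be the sequence of Fourier coefficients for a function $h$ on $[0, T]$, given by $\mathcal{F}_T[h]_n = \int_0^T dy\, e^{i k_n y}\, h(y)$ with $k_n = 2\pi n/T$, $n = 0, 1, \ldots$. Let $*_T$ denote convolution on the interval $[0, T]$, $(g *_T h)(x) = \int_0^T dy\, g(x-y)\,h(y)$. Let $\tilde h$ denote the horizontal reflection of $h$, $\tilde h(y) = h(-y)$. Then Eq. (63) and (64) can be written as

$$\lambda_n^C = -\frac{1}{2VT}\,\mathcal{F}_T[\tilde{\mathcal{L}} *_T \mathcal{E}]_n, \qquad \lambda_n^D = \frac{\langle f\rangle}{T}\,\mathcal{F}_T[\tilde{\mathcal{L}} *_T \mathcal{L}]_n.$$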



* The fact that the variance is proportional to the learning rate is also true for inhomogeneous parameters, by the same argument. But the confinement of the mean postsynaptic potential $\langle U\rangle(x)$ is unclear in that case, because the equilibrium is not necessarily an approximate negative image. Further work is required to characterize the equilibrium for inhomogeneous parameters.

Now we invoke the Fourier convolution theorem $\mathcal{F}_T[g *_T h] = \mathcal{F}_T[g]\,\mathcal{F}_T[h]$, and the fact that $\mathcal{F}_T[\tilde g] = \overline{\mathcal{F}_T[g]}$, where $\bar z$ denotes the complex conjugate of $z$. This gives

$$\lambda_n^C = -\frac{1}{2VT}\,\overline{\mathcal{F}_T[\mathcal{L}]_n}\,\mathcal{F}_T[\mathcal{E}]_n, \tag{65}$$

$$\lambda_n^D = \frac{\langle f\rangle}{T}\,\big|\mathcal{F}_T[\mathcal{L}]_n\big|^2. \tag{66}$$

The eigenvalues of the weight covariance are therefore

$$\lambda_n^W = \frac{\lambda_n^D}{2\,\mathrm{Re}\,\lambda_n^C} = -\langle f\rangle V\,\frac{\big|\mathcal{F}_T[\mathcal{L}]_n\big|^2}{\mathrm{Re}\big[\overline{\mathcal{F}_T[\mathcal{L}]_n}\,\mathcal{F}_T[\mathcal{E}]_n\big]}. \tag{67}$$

It follows that the covariance of $\mathcal{W}(y)$ and $\mathcal{W}(z)$ is

$$\mathrm{cov}\big(\mathcal{W}(y), \mathcal{W}(z)\big) = \sum_{n=0}^\infty e^{i k_n (y-z)}\,\lambda_n^W = -2\pi\langle f\rangle V\,\mathcal{F}_T^{-1}\!\left[\frac{\big|\mathcal{F}_T[\mathcal{L}]\big|^2}{\mathrm{Re}\big[\overline{\mathcal{F}_T[\mathcal{L}]}\,\mathcal{F}_T[\mathcal{E}]\big]}\right](y - z), \tag{68}$$

where $\mathcal{F}_T^{-1}[h](x) = \frac{1}{2\pi}\sum_n e^{i k_n x}\,h_n$ is the inverse Fourier transform on $[0, T]$. The covariance of the postsynaptic potential is then

$$\begin{aligned}
\mathrm{cov}\,U(y, z) &= \int_0^T dx\int_0^T dx'\,\mathcal{E}(y-x)\,\mathrm{cov}\,\mathcal{W}(x, x')\,\mathcal{E}(z-x') \\
&= -2\pi\langle f\rangle V\int_0^T dx\int_0^T dx'\,\mathcal{E}(y-x)\,\mathcal{E}(z-x')\,\mathcal{F}_T^{-1}\!\left[\frac{\big|\mathcal{F}_T[\mathcal{L}]\big|^2}{\mathrm{Re}\big[\overline{\mathcal{F}_T[\mathcal{L}]}\,\mathcal{F}_T[\mathcal{E}]\big]}\right](x - x').
\end{aligned} \tag{69}$$

One special case is worth noting: suppose the PSP and learning rule have identical functional form, i.e. are proportional to one another, $\mathcal{L}(x) = c\,\mathcal{E}(x)$ for some (real) constant $c$. Then we have

$$\mathcal{F}_T^{-1}\!\left[\frac{\big|\mathcal{F}_T[\mathcal{L}]\big|^2}{\mathrm{Re}\big[\overline{\mathcal{F}_T[\mathcal{L}]}\,\mathcal{F}_T[\mathcal{E}]\big]}\right](x) = \frac{c}{2\pi}\sum_n e^{i k_n x} = \frac{c}{2\pi}\,\delta(x),$$

where $\delta(x)$ is the Dirac delta function. For such a learning rule the covariance of the weight density is

$$\mathrm{cov}\big(\mathcal{W}(y), \mathcal{W}(z)\big) = -\langle f\rangle V c\,\delta(y - z). \tag{70}$$

In particular, the covariance (and hence the correlation) of $\mathcal{W}(y)$ and $\mathcal{W}(z)$ is zero for $y \neq z$, hence weights corresponding to different presynaptic spike times are statistically independent. This is surprising, since the coupling of weights through the PSP $\mathcal{E}$ and learning rule $\mathcal{L}$ has some nonzero "range", given roughly by the widths of $\mathcal{E}$ and $\mathcal{L}$, and within this range one would expect the weights to necessarily have some nonzero correlation. The result just derived says that in certain exceptional cases this correlation may vanish. The result was derived in the dense spacing limit, but can be expected to hold approximately for the physical case of discrete spacing, and also to hold approximately for $\mathcal{L}$ not quite proportional to $\mathcal{E}$; this will be verified in the examples calculated below. Given that the best current experimental measurement of the learning rule in mormyrid ELL is not inconsistent with $\mathcal{E}$ and $\mathcal{L}$ having the same functional form, this vanishing correlation phenomenon may have biological relevance.

FIG. 4: PSP and learning rules used in the examples. Stability requires $3 - 2\sqrt{2} < \tau_L/\tau_E < 3 + 2\sqrt{2}$. Stable examples are drawn with solid lines; endpoints of the stable interval are drawn with dashed lines. Arbitrary units.

VIII. EXAMPLES

We now compute the equilibrium weight covariances for a class of PSPs and learning rules consistent with those measured in mormyrid ELL, assuming homogeneous parameters. The PSP we take to be an excitatory alpha function of width $\tau_E$, and the learning rule we take to be alpha function, depressive, and pre-before-post, of width $\tau_L$:

$$\mathcal{E}(x) = \frac{x}{\tau_E^2}\,e^{-x/\tau_E}\,H(x), \tag{71}$$

$$\mathcal{L}(x) = -\frac{x}{\tau_L^2}\,e^{-x/\tau_L}\,H(x), \tag{72}$$



where $H(x)$ is the Heaviside function, $H(x) = 1$ if $x > 0$ and $0$ otherwise (Fig. 4). In the above expressions both $\mathcal{E}$ and $\mathcal{L}$ have been normalized to unit area, but to ensure confinement of the postsynaptic potential, the learning rule $\mathcal{L}$ (and hence the size of the learning steps) must be made sufficiently small so that the confinement condition is satisfied.

FIG. 5: Diagonal variance of weights, for alpha function $\mathcal{E}$ and $\mathcal{L}$ and for various values of $\tau_L/\tau_E$. The larger of $\tau_L$ and $\tau_E$ was taken to be $0.2T$ in all cases. Diagonal variance vs $\tau_L/\tau_E$, log-log plot. Dotted lines indicate the boundary of the stable interval, $\tau_L/\tau_E = 3 \pm 2\sqrt{2}$. Dimensionless units.

It was shown in [18] that in order for the mean weight dynamics to be stable near the (negative image) equilibrium, the time constants $\tau_E$ and $\tau_L$ must satisfy

$$3 - 2\sqrt{2} < \frac{\tau_L}{\tau_E} < 3 + 2\sqrt{2}.$$
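As a plausibility check (a sketch, not the derivation of the cited work), the same interval can be recovered from the dense-spacing physicality condition $\mathrm{Re}\big[\overline{\mathcal{F}[\mathcal{L}]}\,\mathcal{F}[\mathcal{E}]\big] < 0$, under the additional assumptions that $\tau_E, \tau_L \ll T$, so that Fourier coefficients on $[0,T]$ can be replaced by whole-line transforms, and that $k$ is treated as continuous. For unit-area alpha functions,

$$\mathcal{F}[\mathcal{E}](k) = \int_0^\infty dx\,\frac{x}{\tau_E^2}\,e^{-x/\tau_E}\,e^{ikx} = \frac{1}{(1 - ik\tau_E)^2}, \qquad \mathcal{F}[\mathcal{L}](k) = -\frac{1}{(1 - ik\tau_L)^2},$$

so that

$$\overline{\mathcal{F}[\mathcal{L}]}\,\mathcal{F}[\mathcal{E}] = -\frac{1}{z^2}, \qquad z = (1 + ik\tau_L)(1 - ik\tau_E) = 1 + k^2\tau_L\tau_E + ik(\tau_L - \tau_E).$$

Since $\mathrm{Re}[1/z^2] = \mathrm{Re}[z^2]/|z|^4$, the condition holds for all $k$ if and only if

$$\mathrm{Re}[z^2] = (1 + k^2\tau_L\tau_E)^2 - k^2(\tau_L - \tau_E)^2 > 0 \iff \tau_L\tau_E\,k^2 - |\tau_L - \tau_E|\,|k| + 1 > 0 \ \text{ for all } k.$$

The quadratic in $|k|$ is positive everywhere exactly when its discriminant is negative, $(\tau_L - \tau_E)^2 < 4\tau_L\tau_E$. With $\rho = \tau_L/\tau_E$ this is $\rho^2 - 6\rho + 1 < 0$, i.e. $3 - 2\sqrt{2} < \rho < 3 + 2\sqrt{2}$, in agreement with the interval quoted above.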

For $\tau_L/\tau_E$ in this stable range, we calculated the equilibrium covariance of the synaptic weights and of the postsynaptic potential, and verified our predictions by direct Monte Carlo simulation of the underlying random walk. The number of presynaptic cells was taken to be $N = 50$, and to ensure that the confinement condition was well satisfied, the rates of nonassociative and associative learning were adjusted so that the confinement parameter was $r(x) = 0.2$ for all $x$ (i.e. the tails were 5 standard deviations away from the mean postsynaptic potential). By translational symmetry for homogeneous parameters, the diagonal variances $\mathrm{cov}(w_i, w_i)$ are independent of $i$, and the off-diagonal covariance $\mathrm{cov}(w_i, w_j)$ depends only on $(x_i - x_j) \bmod T$. The covariance matrix is then completely described by the diagonal variance (a single number) and the correlation of weight $w_i$ with the "midpoint" weight $w_{N/2}$, for $i = 1, 2, \ldots, N$; the correlation in this case is just the covariance normalized by the diagonal variance. The diagonal variance is shown in Fig. 5 and the correlation is shown in Fig. 6 for various values of $\tau_L/\tau_E$ between $3 - 2\sqrt{2}$ and $3 + 2\sqrt{2}$. Note the approximate vanishing of off-diagonal correlation for $\tau_L/\tau_E$ near 1, as expected from the analytic calculation in the dense-spacing limit. The manner in which the correlation deviates from an approximate delta function as $\tau_L/\tau_E$ deviates from 1 also shows an interesting pattern: for $\tau_L/\tau_E$ slightly greater than 1, the near-diagonal (near-neighbor) correlation is positive, while for $\tau_L/\tau_E$ slightly less than 1, the near-neighbor correlation is negative. But for $\tau_L/\tau_E$ substantially greater than or less than 1, the near-neighbor correlation is positive in both cases. The magnitude of off-diagonal correlation tends to increase as $\tau_L/\tau_E$ moves away from 1 in either direction. Near the limits of the stable range of $\tau_L/\tau_E$, the near-neighbor correlation is close to 1 and the "antipodal" correlation (correlation with weights a half period away) is close to $-1$. Such strong long-range correlation/anticorrelation was also observed numerically in [17] in mean weight dynamics for parameters near the boundary of the stable region, with breakdown of stability being characterized by the appearance of travelling waves.

The correlation of the postsynaptic potential is shown in Fig. 7. For $\tau_L/\tau_E$ near 1 the correlation is everywhere positive. As $\tau_L/\tau_E$ deviates from 1, the correlation decreases, and long-range anti-correlations appear. As $\tau_L/\tau_E$ deviates still further, the anti-correlation decreases in range and increases in magnitude, and a positive long-range correlation appears. For $\tau_L/\tau_E$ near the limits of the stable range, the mid-range and long-range (antipodal) correlations approach $-1$ and $+1$, respectively, similar to the behavior of the synaptic weight correlation. The "scalloped" appearance of these curves for large $\tau_L/\tau_E$ is due to $\tau_E$ being not much larger than the spacing $\delta = T/50$ between presynaptic spike times, resulting in only marginal overlap of adjacent PSPs. For fixed PSP width $\tau_E$, such scalloping should vanish as the spacing of presynaptic spike times goes to zero. It is believed [C.C. Bell, private communication] that in mormyrid ELL the spacing of presynaptic spike times is sufficiently dense that this scalloping would be insignificant.

Comparison with direct Monte Carlo simulation of the random walk revealed excellent agreement with prediction, provided confinement was well satisfied; results for $\tau_L/\tau_E = 5.814$, near the upper end of the stable range, are shown in Fig. 8. As above, nonassociative and associative learning rates were adjusted so that the confinement parameter $r(x)$ was 0.2 for all $x$ (i.e. the tails were five standard deviations away from the equilibrium mean). Weights were taken to be initially uncorrelated, with mean equal to the predicted mean and variance equal to the predicted (diagonal) variance; the initial correlation was then the discrete delta function. To quantify convergence we used the mean absolute value of the relative discrepancy between the predicted and actual (ensemble mean) correlation. Translation invariance of the correlation allowed us to reduce the size of fluctuations in the simulation estimate by averaging not just over the ensemble but also over the population of $N = 50$ weights in each member of the ensemble*. Using this measure, the correlation in the simulation converged to within 1 to 2 percent of the predicted correlation in approximately 10^ timesteps (Fig. 8).

* Although the predicted correlation is translation invariant, the fluctuations around the prediction are not necessarily uncorrelated. For our purposes this is harmless; it simply means that we don't obtain as large a reduction in fluctuation size by population averaging as we would by using a 50-times larger ensemble.

FIG. 6: Correlation of weights, for alpha function $\mathcal{E}$ and $\mathcal{L}$ and for various values of $\tau_L/\tau_E$. The larger of $\tau_L$ and $\tau_E$ was taken to be $0.2T$ in all cases. Curves are labelled by the value of $\tau_L/\tau_E$, and for clarity curves are not joined to the point (0.5, 1) which all curves have in common. (a) Correlation of $w_i$ with $w_{N/2}$, versus $x_i/T$, for $\tau_L/\tau_E$ significantly less than 1. (b) Same, for $\tau_L/\tau_E$ significantly greater than 1. (c) Same, for $\tau_L/\tau_E$ near 1, with expanded vertical scale. Dimensionless units.

FIG. 7: Correlation of postsynaptic potential, for alpha function $\mathcal{E}$ and $\mathcal{L}$ and for various values of $\tau_L/\tau_E$. The larger of $\tau_L$ and $\tau_E$ was taken to be $0.2T$ in all cases. (a) Correlation of $U(x)$ with $U(T/2)$, versus $x/T$, for $\tau_L/\tau_E$ significantly less than 1. (b) Same, for $\tau_L/\tau_E$ significantly greater than 1. (c) Same, for $\tau_L/\tau_E$ near 1, with expanded vertical scale. Curves are labelled by the value of $\tau_L/\tau_E$. Dimensionless units.

FIG. 8: Convergence of weight correlation to predicted equilibrium values in Monte Carlo simulation, for $\tau_L/\tau_E = 5.81$, $N = 50$, confinement parameter $= 0.2$. (a) Time-evolution of population-averaged correlation; curves labelled by time, $t/T$. Dotted curve indicates prediction. (b) Relative discrepancy between predicted and actual correlation, vs. time $t/T$. Dimensionless units.



IX. SUMMARY 

Since changes in synaptic weights in STDP are due to 
temporally discrete events (spikes or spike pairs), the dy- 
namics of such plasticity, in the presence of noise, is nat- 
urally modelled as a discrete-time random walk. There is 
a large body of mathematical technique for the analysis 
of such processes [20].

From the weight dynamics expressed as a random 
walk one can write down a master equation for the time 
evolution of the weight probability distribution. From 
the master equation we obtain a functional equation for 
the equilibrium weight distribution. Taking the Fourier 
transform of this equation yields a differential equation 



for the characteristic function of the equilibrium distri- 
bution, and Taylor expansion then yields a hierarchy of 
recurrence relations for the equilibrium moments. From 
the moments of the equilibrium weight distribution we 
also obtain the moments of the postsynaptic membrane 
potential. 

For the case of a single weight, we explicitly calculate 
moments up to fourth order. The distribution is shown to 
be generically non-Gaussian, but the skew and kurtosis 
approach Gaussian values as the learning rate (size of 
steps) goes to zero. 
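
The approach to Gaussian values can be illustrated with a toy linear
random walk, $w \leftarrow (1 - \epsilon a)\,w + \epsilon\,\eta$, with
independent noise $\eta$. This toy model, the names `eps` and `a`, and
the choice of noise cumulants are assumptions made for illustration;
they are not the single-weight model analysed above. Because cumulants
of independent terms add, the $n$-th equilibrium cumulant satisfies
$\kappa_n(w) = \epsilon^n \kappa_n(\eta) / (1 - \rho^n)$ with
$\rho = 1 - \epsilon a$, and the sketch below evaluates the resulting
skewness and excess kurtosis for decreasing step size.

```python
def equilibrium_shape(eps, a=1.0, noise_cumulants=(1.0, 1.0, 1.0)):
    """Skewness and excess kurtosis of the stationary distribution of the
    toy walk  w <- (1 - eps*a) * w + eps * eta  (hypothetical model).

    noise_cumulants = (kappa_2, kappa_3, kappa_4) of the noise eta.
    The default (1, 1, 1) corresponds to centred Poisson noise of unit rate.
    """
    rho = 1.0 - eps * a
    if not 0.0 < rho < 1.0:
        raise ValueError("need 0 < eps * a < 1 for a stable, non-trivial walk")
    # Stationarity: kappa_n(w) = rho**n * kappa_n(w) + eps**n * kappa_n(eta)
    k2, k3, k4 = (
        eps**n * k / (1.0 - rho**n)
        for n, k in zip((2, 3, 4), noise_cumulants)
    )
    skewness = k3 / k2**1.5
    excess_kurtosis = k4 / k2**2
    return skewness, excess_kurtosis


# Both shape parameters shrink toward the Gaussian value 0 as eps -> 0.
for eps in (0.5, 0.1, 0.01, 0.001):
    skew, exkurt = equilibrium_shape(eps)
    print(f"eps={eps:<6} skewness={skew:.4f} excess kurtosis={exkurt:.5f}")
```

In this toy model the skewness falls off like $\epsilon^{1/2}$ and the
excess kurtosis like $\epsilon$, so the distribution is non-Gaussian at
any finite step size but Gaussian in the small-step limit.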

For the case of multiple weights we explicitly calculate 
moments up to second order. The mean weight vector 
satisfies a simple matrix-vector equation, which is equivalent
to the condition that the mean step in the equilibrium
state is zero, for all weights. The weight covariance
matrix satisfies a Lyapunov equation. An explicit solu- 
tion to this equation, in the form of a matrix integral, is 
obtained. For this solution to be the covariance matrix 
of some probability distribution it must be positive definite,
which imposes a constraint on the PSP $\mathcal{E}$ and the
associative learning rule $\mathcal{L}$.
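
A minimal numerical sketch of a Lyapunov equation and its
matrix-integral solution follows. It assumes the standard form
$A\Sigma + \Sigma A^{T} = -Q$ with a stable matrix $A$ (all eigenvalues
with negative real part), for which
$\Sigma = \int_0^\infty e^{At} Q\, e^{A^{T}t}\,dt$. The sign convention,
the 2x2 matrices `A` and `Q`, and the function names are illustrative
assumptions, not quantities taken from the derivation above.

```python
import numpy as np


def solve_lyapunov(A, Q):
    """Solve A S + S A^T = -Q by vectorisation (small matrices only)."""
    n = A.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A S) = (A kron I) vec(S), vec(S A^T) = (I kron A) vec(S)
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)


def lyapunov_integral(A, Q, t_max=30.0, steps=3000):
    """Trapezoid-rule estimate of the integral of exp(At) Q exp(A^T t) on [0, t_max].

    Assumes A is diagonalisable and stable, so the integrand decays.
    """
    lam, V = np.linalg.eig(A)
    V_inv = np.linalg.inv(V)
    ts = np.linspace(0.0, t_max, steps + 1)
    dt = ts[1] - ts[0]
    S = np.zeros(Q.shape, dtype=float)
    for i, t in enumerate(ts):
        expAt = np.real((V * np.exp(lam * t)) @ V_inv)  # exp(A t)
        weight = 0.5 if i in (0, steps) else 1.0        # trapezoid end points
        S += weight * dt * (expAt @ Q @ expAt.T)
    return S


def is_positive_definite(S):
    """Check the physicality condition on a candidate covariance matrix."""
    return bool(np.all(np.linalg.eigvalsh(0.5 * (S + S.T)) > 0.0))


# Illustrative (hypothetical) inputs: stable drift A, positive definite Q.
A = np.array([[-1.0, 0.3], [0.2, -1.5]])
Q = np.array([[1.0, 0.2], [0.2, 0.5]])

S_direct = solve_lyapunov(A, Q)
S_integral = lyapunov_integral(A, Q)
print("max |direct - integral| =", np.abs(S_direct - S_integral).max())
print("positive definite:", is_positive_definite(S_direct))
```

The direct linear solve and the integral agree to quadrature accuracy,
and the eigenvalue test is one simple way to check whether a candidate
solution is admissible as a covariance matrix.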

For the case of multiple weights with homogeneous pa- 
rameters, further analytical progress can be made. The 
Lyapunov equation for the weight covariance matrix can 
be fully diagonalized, and the covariance of any pair of 
weights found in closed form. From this we also obtain 
explicit expressions for the covariance of the postsynaptic 
potential between any pair of times. The physicality con- 
dition, that the weight covariance matrix be positive def- 
inite, takes an especially simple form in this case, closely 
related to the condition derived in for stability of the 
mean weight state. 
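
To illustrate how homogeneity permits full diagonalization, the sketch
below assumes that the matrices in a Lyapunov equation
$A\Sigma + \Sigma A^{T} = -Q$ are circulant, as one would expect for
identical parameters and periodically spaced presynaptic spike times.
Under that assumption the discrete Fourier modes diagonalize every
matrix at once, and each Fourier component of $\Sigma$ is
$\hat\sigma_k = -\hat q_k / (2\,\mathrm{Re}\,\hat a_k)$. The circulant
structure, the sign convention, and the example first columns `a` and
`q` are assumptions chosen for illustration.

```python
import numpy as np


def circulant(c):
    """Circulant matrix with first column c, i.e. C[i, j] = c[(i - j) mod n]."""
    c = np.asarray(c)
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return c[idx]


def circulant_lyapunov(a, q):
    """Solve A S + S A^T = -Q for circulant A, Q (first columns a, q).

    Q is assumed symmetric, so its Fourier coefficients are real.
    Returns the first column of S and the eigenvalues of S.
    """
    a_hat = np.fft.fft(a)        # eigenvalues of A
    q_hat = np.fft.fft(q).real   # eigenvalues of Q
    s_hat = -q_hat / (2.0 * a_hat.real)
    return np.fft.ifft(s_hat).real, s_hat


# Hypothetical homogeneous example with n = 8 weights.
n = 8
a = np.zeros(n)
a[0], a[1], a[-1] = -1.0, 0.3, 0.1   # non-symmetric but stable drift
q = np.zeros(n)
q[0], q[1], q[-1] = 1.0, 0.25, 0.25  # symmetric noise matrix

s, s_hat = circulant_lyapunov(a, q)
print("covariance of weight 0 with weight j:", np.round(s, 4))
print("positive definite:", bool(np.all(s_hat > 0)))
```

In this toy setting, with the eigenvalues of $Q$ positive, positive
definiteness of $\Sigma$ reduces to $\mathrm{Re}\,\hat a_k < 0$ for
every mode, which is the same condition as stability of the mean
dynamics generated by $A$.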

In the limit of dense spacing of presynaptic spike times,
the expression for the weight covariance is further simplified.
In the special case where $\mathcal{E}$ and $\mathcal{L}$ have the
same functional form, we find, surprisingly, that weights
corresponding to distinct presynaptic spike times are
statistically independent. This result can be expected to hold
approximately for discrete presynaptic spike times, and
for learning rules not quite identical to $\mathcal{E}$ in
functional form.

Numerical calculation of the equilibrium weight covariance
and postsynaptic potential covariance was carried out for a
class of examples relevant to mormyrid ELL: both
$\mathcal{E}$ and $\mathcal{L}$ of alpha-function form, with
$\mathcal{E}$ excitatory and $\mathcal{L}$ depressive
pre-before-post. For the synaptic weights, off-diagonal
correlation is near zero for $\tau_L/\tau_E = 1$, and tends to
increase in magnitude as $\tau_L/\tau_E$ moves away from 1.
Values of $\tau_L/\tau_E$ near the boundary of the stable range
show large long-range anticorrelations. The correlation of the
postsynaptic potential is everywhere positive for
$\tau_L/\tau_E = 1$, but long-range anticorrelations develop as
$\tau_L/\tau_E$ moves away from 1. These numerical predictions
were found to be in excellent agreement with direct Monte Carlo
simulation of the underlying random walk.
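
The kind of check described here, comparing a predicted equilibrium
covariance with a population of simulated walkers, can be sketched for
a toy linear random walk
$\mathbf{w} \leftarrow (I + \epsilon A)\mathbf{w} + \epsilon\,\boldsymbol{\eta}$
with Gaussian noise of covariance $Q$. The linear update, the Gaussian
noise, and all parameter values are assumptions for illustration; the
simulation reported above uses the full STDP random walk. For this toy
walk the exact stationary covariance obeys
$\Sigma = M\Sigma M^{T} + \epsilon^{2}Q$ with $M = I + \epsilon A$.

```python
import numpy as np


def predicted_covariance(A, Q, eps):
    """Stationary covariance of w <- (I + eps*A) w + eps*eta, Cov(eta) = Q."""
    n = A.shape[0]
    M = np.eye(n) + eps * A
    # Row-major vec: vec(M S M^T) = (M kron M) vec(S)
    K = np.eye(n * n) - np.kron(M, M)
    return np.linalg.solve(K, (eps**2 * Q).reshape(-1)).reshape(n, n)


def simulated_covariance(A, Q, eps, n_walkers=50000, n_steps=400, seed=0):
    """Population covariance of independent walkers after n_steps updates."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    M = np.eye(n) + eps * A
    chol = np.linalg.cholesky(Q)           # chol @ chol.T == Q
    W = np.zeros((n_walkers, n))           # every walker starts at the mean
    for _ in range(n_steps):
        eta = rng.standard_normal((n_walkers, n)) @ chol.T
        W = W @ M.T + eps * eta            # one random-walk step per walker
    return np.cov(W, rowvar=False)


# Hypothetical parameters, chosen only so that the walk is stable.
A = np.array([[-1.0, 0.3], [0.2, -1.5]])
Q = np.array([[1.0, 0.2], [0.2, 0.5]])
eps = 0.1

S_pred = predicted_covariance(A, Q, eps)
S_sim = simulated_covariance(A, Q, eps)
rel_err = np.linalg.norm(S_sim - S_pred) / np.linalg.norm(S_pred)
print("relative discrepancy between simulation and prediction:", rel_err)
```

The residual discrepancy in such a comparison is set by the finite
number of walkers and shrinks roughly as the inverse square root of the
population size.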